11 years agokconfig: made check-lxdialog more portable
Sam Ravnborg [Thu, 1 May 2008 17:29:47 +0000]
kconfig: made check-lxdialog more portable

OS-X shell did not like 'echo -e' so implement
suggestion from Al Viro to use a more portable construct.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Timur Tabi <timur@freescale.com>

11 years agoUpdate .gitignore to include include/linux/bounds.h
Theodore Ts'o [Thu, 1 May 2008 01:55:48 +0000]
Update .gitignore to include include/linux/bounds.h

(which is autogenerated by kbuild)

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Thu, 1 May 2008 03:13:22 +0000]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  ipv6: Compilation fix for compat MCAST_MSFILTER sockopts.

11 years agoFix dnotify/close race
Al Viro [Thu, 1 May 2008 02:52:22 +0000]
Fix dnotify/close race

We have a race between fcntl() and close() that can lead to
dnotify_struct inserted into inode's list *after* the last descriptor
had been gone from current->files.

Since that's the only point where dnotify_struct gets evicted, we are
screwed - it will stick around indefinitely.  Even after struct file in
question is gone and freed.  Worse, we can trigger send_sigio() on it at
any later point, which allows sending an arbitrary signal to an arbitrary
process if we manage to apply enough memory pressure to get the page
that used to host that struct file and fill it with the right pattern...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agox86: Mark OPTIMIZE_INLINING broken
Linus Torvalds [Thu, 1 May 2008 02:50:03 +0000]
x86: Mark OPTIMIZE_INLINING broken

So Ingo finally did figure out why UML broke with this option: UML
passes gcc the -fno-unit-at-a-time flag, and apparently that wreaks
havoc with gcc's inlining.

We could turn off -fno-unit-at-a-time for UML for gcc4+ (which is what
x86 does), but there's bad blood about this whole option, and it does
show that the thing is just fragile as heck.

So let tempers cool, and disable the thing, and we can revisit the
decision later.

Cc: Adrian Bunk <bunk@kernel.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux...
Linus Torvalds [Thu, 1 May 2008 02:31:52 +0000]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes3

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes3: (21 commits)
  x86: numaq fix
  x86: 8K stacks by default
  x86: ioremap ram check fix
  x86: fix HT cpu booting on 32-bit
  x86: optimize inlining off
  x86: CONFIG_X86_ELAN fix
  x86: Kconfig fix
  x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()
  x86: use defconfigs from x86/configs/*
  toshiba: use ioremap_cached
  revert: "x86: ioremap(), extend check to all RAM pages"
  x86: don't bother printing compat vdso address
  fix: x86: support for new UV apic
  x86: fix early-BUG message
  x86: iommu_sac_force can become static
  x86: add proper header for reboot_force
  x86 VISWS: build fix
  x86, voyager: fix ioremap_nocache()
  hpet: fix
  x86: unexport kmap_atomic_to_page
  ...

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6
Linus Torvalds [Thu, 1 May 2008 00:05:21 +0000]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  klist: fix coding style errors in klist.h and klist.c
  driver core: remove no longer used "struct class_device"
  pcmcia: remove pccard_sysfs_interface warnings
  devres: support addresses greater than an unsigned long via dev_ioremap
  kobject: do not copy vargs, just pass them around
  sysfs: sysfs_update_group stub for CONFIG_SYSFS=n
  DEBUGFS: Correct location of debugfs API documentation.
  driver core: warn about duplicate driver names on the same bus
  klist: implement klist_add_{after|before}()
  klist: implement KLIST_INIT() and DEFINE_KLIST()
  sysfs: Disallow truncation of files in sysfs

11 years agoklist: fix coding style errors in klist.h and klist.c
Greg Kroah-Hartman [Wed, 30 Apr 2008 23:43:45 +0000]
klist: fix coding style errors in klist.h and klist.c

Finally clean up the odd spacing in these files.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agodriver core: remove no longer used "struct class_device"
Kay Sievers [Wed, 12 Mar 2008 19:47:35 +0000]
driver core: remove no longer used "struct class_device"

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agopcmcia: remove pccard_sysfs_interface warnings
David Brownell [Mon, 28 Apr 2008 08:03:20 +0000]
pcmcia: remove pccard_sysfs_interface warnings

Make the PCMCIA core stop using class_interface to hide socket attribute
registration.  This removes the associated section mismatch warnings, and
helps get to the point where that mechanism can finally be removed.

Simplify that attribute registration by using an attribute_group.
This is a net shrink in object size.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agodevres: support addresses greater than an unsigned long via dev_ioremap
Kumar Gala [Tue, 29 Apr 2008 15:25:48 +0000]
devres: support addresses greater than an unsigned long via dev_ioremap

Use a resource_size_t instead of unsigned long since some architectures are
capable of having ioremap deal with addresses greater than the size of an
unsigned long.
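
The truncation this avoids can be sketched in userspace C. This is only an
illustration, not the devres code: uint32_t stands in for a 32-bit unsigned
long, uint64_t stands in for resource_size_t, and both helper names are
hypothetical.

```c
#include <stdint.h>

/* Assumption: models an arch where unsigned long is 32 bits wide. */
typedef uint32_t ulong32_t;
/* Assumption: models resource_size_t on a platform with >32-bit physical addresses. */
typedef uint64_t res_size_t;

/* Passing the address through the narrow type silently drops the high bits. */
static uint64_t offset_via_ulong(uint64_t phys)
{
	ulong32_t offset = (ulong32_t)phys;

	return offset;
}

/* Passing it through the wide type keeps the full physical address. */
static uint64_t offset_via_resource_size(uint64_t phys)
{
	res_size_t offset = phys;

	return offset;
}
```

For a 36-bit address such as 0xF00001000, the first helper yields 0x00001000
while the second returns the address unchanged.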

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agokobject: do not copy vargs, just pass them around
Kay Sievers [Wed, 30 Apr 2008 00:06:29 +0000]
kobject: do not copy vargs, just pass them around

This prevents a few unneeded copies.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agosysfs: sysfs_update_group stub for CONFIG_SYSFS=n
Randy Dunlap [Wed, 30 Apr 2008 16:01:17 +0000]
sysfs: sysfs_update_group stub for CONFIG_SYSFS=n

scsi_transport_spi uses sysfs_update_group() when CONFIG_SYSFS=n,
so provide a stub for it.

next-20080423/drivers/scsi/scsi_transport_spi.c:1467: error: implicit declaration of function 'sysfs_update_group'
make[3]: *** [drivers/scsi/scsi_transport_spi.o] Error 1

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agoDEBUGFS: Correct location of debugfs API documentation.
Robert P. J. Day [Fri, 25 Apr 2008 12:52:51 +0000]
DEBUGFS: Correct location of debugfs API documentation.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agodriver core: warn about duplicate driver names on the same bus
Stas Sergeev [Sat, 26 Apr 2008 15:52:35 +0000]
driver core: warn about duplicate driver names on the same bus

Currently an attempt to register multiple
drivers with the same name causes a
stack trace with a cryptic error message.
This patch adds the necessary check
and a clear error message.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agoklist: implement klist_add_{after|before}()
Tejun Heo [Tue, 22 Apr 2008 09:58:46 +0000]
klist: implement klist_add_{after|before}()

Add klist_add_after() and klist_add_before() which put a new node
after and before an existing node, respectively.  This is useful for
callers which need to keep klist ordered.  Note that synchronizing
between simultaneous additions for ordering is the caller's
responsibility.
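
How a caller might use such primitives to keep a list ordered can be
sketched with a plain circular doubly linked list. This is not the klist
implementation (no reference counting, no locking); struct node and every
function name below are hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical stand-in for a list node; the list head is a node too. */
struct node {
	int key;
	struct node *prev, *next;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

/* Link n immediately after pos. */
static void add_after(struct node *n, struct node *pos)
{
	n->prev = pos;
	n->next = pos->next;
	pos->next->prev = n;
	pos->next = n;
}

/* Link n immediately before pos. */
static void add_before(struct node *n, struct node *pos)
{
	add_after(n, pos->prev);
}

/*
 * Keep the list sorted by key: find the first node with a larger key and
 * insert in front of it. The caller must serialize concurrent insertions.
 */
static void add_sorted(struct node *head, struct node *n)
{
	struct node *p = head->next;

	while (p != head && p->key <= n->key)
		p = p->next;
	add_before(n, p);
}
```

Inserting keys 3, 1, 2 through add_sorted() leaves the list in the order
1, 2, 3.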

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agoklist: implement KLIST_INIT() and DEFINE_KLIST()
Tejun Heo [Fri, 25 Apr 2008 18:16:04 +0000]
klist: implement KLIST_INIT() and DEFINE_KLIST()

klist is missing static initializers and definition helper.  Add them.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agosysfs: Disallow truncation of files in sysfs
Ben Hutchings [Mon, 28 Apr 2008 14:59:58 +0000]
sysfs: Disallow truncation of files in sysfs

sysfs allows attribute files to be truncated, e.g. using ftruncate(), with the
expected effect on their inode.   For most attributes, this doesn't change the
"real" size of the file i.e. how much can be read from it.  However, the
parameter validation for reading and writing binary attribute files is based
on the inode size and not the size specified in the file's bin_attribute, so it
can be broken by this. For example, if we try using dd to write to such a file:

# pwd
/sys/bus/pci/devices/0000:08:00.0
# ls -l config
-rw-r--r--  1 root root 4096 Feb  1 17:35 config
# dd if=/dev/zero of=config bs=4 count=1
1+0 records in
1+0 records out
# ls -l config
-rw-r--r--  1 root root 0 Feb  1 17:50 config
# dd if=/dev/zero of=config bs=4 count=1 seek=128
dd: writing `config': No space left on device
1+0 records in
0+0 records out

Also, after truncation to 0, parameter validation for read and write is
disabled.  Most bin_attribute read and write methods also validate the size and
offset, but for some this will allow out-of-range access.  This may be a
security issue, though access to such files is often limited to root.  In any
case, the validation should remain for safety's sake!

This was previously reported in Bugzilla as bug 9867.

sysfs should ignore size changes or else refuse them (by returning -EINVAL).
This patch makes it ignore them.
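
Why a size of 0 is dangerous can be shown with a small model of the check.
This is a sketch under the assumption, taken from the description above,
that validation is skipped whenever the size is 0; checked_count() is a
hypothetical helper, not the sysfs code.

```c
#include <stddef.h>

/*
 * Hypothetical model of the bounds check described above. `size` plays the
 * role of the inode size; a value of 0 disables the check entirely.
 * Returns the number of bytes the request is allowed to touch.
 */
static size_t checked_count(size_t size, size_t offs, size_t count)
{
	if (size) {
		if (offs >= size)
			return 0;              /* request starts past the end */
		if (count > size - offs)
			count = size - offs;   /* clamp to the remaining bytes */
	}
	return count;                          /* size == 0: passed through unchecked */
}
```

With size 4096, a 4-byte request at offset 8192 is rejected; once the size
has been truncated to 0, the same request passes through unchecked.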

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

11 years agoFix ACPI vs proc_create_data() mismerge
Alexey Dobriyan [Thu, 1 May 2008 00:10:02 +0000]
Fix ACPI vs proc_create_data() mismerge

acpi_device_dir() is NULL until all files are created, so everything is
created straight in /proc/ and the creation code warns.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoipv6: Compilation fix for compat MCAST_MSFILTER sockopts.
Pavel Emelyanov [Wed, 30 Apr 2008 21:49:54 +0000]
ipv6: Compilation fix for compat MCAST_MSFILTER sockopts.

The last hunk from the commit dae50295 (ipv4/ipv6 compat: Fix SSM
applications on 64bit kernels.) escaped from the compat_ipv6_setsockopt
to the ipv6_getsockopt (I guess due to patch smartness wrt searching
for context) thus breaking 32-bit and 64-bit-without-compat compilation.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 years agox86: numaq fix
Ingo Molnar [Wed, 30 Apr 2008 21:05:52 +0000]
x86: numaq fix

do not override the existing pci-y rule when adding visws or
numaq rules.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

11 years agox86: 8K stacks by default
Ingo Molnar [Wed, 30 Apr 2008 18:45:40 +0000]
x86: 8K stacks by default

Switch back to 8K stacks as the safer default. Out-of-memory
situations are less problematic than silent and hard to debug
stack corruption.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: ioremap ram check fix
Andres Salomon [Wed, 30 Apr 2008 15:30:24 +0000]
x86: ioremap ram check fix

bdd3cee2e4b7279457139058615ced6c2b41e7de (x86: ioremap(), extend check
to all RAM pages) breaks OLPC's ioremap call.  The ioremap that OLPC uses is:

        romsig = ioremap(0xffffffc0, 16);

The commit that breaks it is basically:

-       for (pfn = phys_addr >> PAGE_SHIFT; pfn < max_pfn_mapped &&
-            (pfn << PAGE_SHIFT) < last_addr; pfn++) {
+       for (pfn = phys_addr >> PAGE_SHIFT;
+                               (pfn << PAGE_SHIFT) < last_addr; pfn++) {
+

Previously, the 'pfn < max_pfn_mapped' check would've caused us to not
enter the loop.  Removing that check means we loop infinitely.  The
reason for that is because pfn is 0xfffff, and last_addr is 0xffffffcf.
The remaining check that is used to exit the loop is not sufficient;
when pfn<<PAGE_SHIFT is 0xfffff000, that is less than 0xffffffcf; when
we increment pfn and it overflows (pfn == 0x100000), pfn<<PAGE_SHIFT
ends up being 0.  That, of course, is less than last_addr.  In effect,
pfn<<PAGE_SHIFT is always lower than last_addr, so the loop never exits.

The simple fix for this is to limit the last_addr check to the PAGE_MASK;
a patch is below.
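
The wraparound can be reproduced in userspace. The sketch below assumes a
32-bit unsigned long (modelled with uint32_t) and 4K pages; both function
names and the iteration limit are hypothetical, and the masked variant is
one reading of "limit the last_addr check to the PAGE_MASK", not a copy of
the actual patch.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  (~((UINT32_C(1) << PAGE_SHIFT) - 1))

/*
 * Loop condition without the max_pfn_mapped check. Counts iterations and
 * gives up after `limit` so that the runaway case terminates here.
 */
static unsigned int walk_unmasked(uint32_t phys_addr, uint32_t last_addr,
				  unsigned int limit)
{
	unsigned int n = 0;
	uint32_t pfn;

	for (pfn = phys_addr >> PAGE_SHIFT;
	     (uint32_t)(pfn << PAGE_SHIFT) < last_addr; pfn++) {
		if (++n >= limit)
			break;
	}
	return n;
}

/* Same loop, but comparing against the page-aligned last_addr. */
static unsigned int walk_masked(uint32_t phys_addr, uint32_t last_addr,
				unsigned int limit)
{
	unsigned int n = 0;
	uint32_t pfn;

	for (pfn = phys_addr >> PAGE_SHIFT;
	     (uint32_t)(pfn << PAGE_SHIFT) < (last_addr & PAGE_MASK); pfn++) {
		if (++n >= limit)
			break;
	}
	return n;
}
```

With phys_addr 0xffffffc0 and last_addr 0xffffffcf, the unmasked loop runs
until the limit is hit, while the masked loop is never entered.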

Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: fix HT cpu booting on 32-bit
Hugh Dickins [Wed, 30 Apr 2008 15:17:46 +0000]
x86: fix HT cpu booting on 32-bit

Since recent smpboot 32/64-bit merge, my dual Xeon with HT has been
booting only 2 of its 4 cpus (when running an i386 kernel; but x86_64
is okay).  J.A. Magallón reports the same.

 native_cpu_up: bad cpu 2
 native_cpu_up: bad cpu 3

The mach-default cpu_present_to_apicid() was just returning cpu number
(2, 3) instead of apicid (6, 7): looks like we now need the x86_64 code
even for the i386 case.

Comparing with other versions of cpu_present_to_apicid(), it seems a
good idea to include an NR_CPUS test too, since cpu_present() doesn't
include that; but that wasn't a problem here, and may be no problem at all.

Prior to that smpboot merge, my Xeon booted the two HT siblings on one
physical first, then the two siblings on the other physical after - when
i386, but alternated them when x86_64.  Since the merge, the x86_64
sequence is unchanged, but the i386 sequence is now like x86_64.

I prefer this consistency, and I prefer the new sequence: booting with
maxcpus=2 then uses the independent physicals without HT sharing.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: optimize inlining off
Ingo Molnar [Wed, 30 Apr 2008 08:29:13 +0000]
x86: optimize inlining off

default to inline optimizing off.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: CONFIG_X86_ELAN fix
Ingo Molnar [Wed, 30 Apr 2008 06:58:27 +0000]
x86: CONFIG_X86_ELAN fix

move the X86_CPU section out of the !X86_ELAN branch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: Kconfig fix
Ingo Molnar [Wed, 30 Apr 2008 06:48:45 +0000]
x86: Kconfig fix

Andrew noticed that OPTIMIZE_INLINING appeared in the toplevel
menu - fix it.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache...
Suresh Siddha [Sat, 26 Apr 2008 00:07:22 +0000]
x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()

Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
Once all the X drivers move to ioremap_wc(), we can go back to strong
UC semantics for ioremap() and ioremap_nocache().

To avoid attribute aliasing issues, pci_mmap_page_range() will also
use UC_MINUS for default non write-combining mapping request.

Next steps:
a) change all the video drivers using ioremap() or ioremap_nocache()
   and adding WC MTRR using mtrr_add() to ioremap_wc()

b) for strict usage, we can go back to strong uc semantics
   for ioremap() and ioremap_nocache() after some grace period for
   completing step-a.

c) user level X server needs to use the appropriate method for setting
   up WC mapping (like using resourceX_wc sysfs file instead of
   adding MTRR for WC and using /dev/mem or resourceX under /sys)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: use defconfigs from x86/configs/*
Sam Ravnborg [Tue, 29 Apr 2008 10:48:15 +0000]
x86: use defconfigs from x86/configs/*

Daniel Drake <dsd@gentoo.org> reported:

In 2.6.23, if you unpacked a kernel source tarball and then
ran "make menuconfig" you'd be presented with this message:
    # using defaults found in arch/i386/defconfig

and the default options would be set.

The same thing in 2.6.24 does not give you any "using defaults" message, and
the default config options within menuconfig are rather blank (e.g. no PCI
support). You can work around this by explicitly running "make defconfig"
before menuconfig, but it would be nice to have the behaviour the way it was
for 2.6.23 (and the way it still is for other archs).

Fixed by adding a x86 specific defconfig list to Kconfig.

Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=10470
Tested-by: dsd@gentoo.org
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agotoshiba: use ioremap_cached
Alan Cox [Tue, 29 Apr 2008 13:20:23 +0000]
toshiba: use ioremap_cached

The switch of ioremap to default to uncached doesn't break this driver
but it does needlessly slow it down as BIOS space is cachable and this
driver is quite happy scanning cached ROM space.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agorevert: "x86: ioremap(), extend check to all RAM pages"
Ingo Molnar [Tue, 29 Apr 2008 10:04:51 +0000]
revert: "x86: ioremap(), extend check to all RAM pages"

Vegard Nossum reported a large (150 seconds) boot delay during bootup,
and bisected it to "x86: ioremap(), extend check to all RAM pages"
(commit bdd3cee2e4b). Revert this commit for now.

Bisected-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: don't bother printing compat vdso address
Jeremy Fitzhardinge [Mon, 28 Apr 2008 18:05:07 +0000]
x86: don't bother printing compat vdso address

The kernel prints the compat vdso address regardless of whether compat
vdso mode is enabled or not, which is confusing.  Given that this
isn't very interesting information anyway, just remove the printk.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Gerhard Mack <gmack@innerfire.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agofix: x86: support for new UV apic
Andi Kleen [Fri, 25 Apr 2008 09:45:26 +0000]
fix: x86: support for new UV apic

Don't warn in read_apic_id() when preemptible but only one CPU online.

Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: fix early-BUG message
Vegard Nossum [Fri, 25 Apr 2008 19:02:34 +0000]
x86: fix early-BUG message

The .asciz directive takes any number of strings, but each one is zero-
terminated, and string pasting is not done as in C. That results in only the
first line being output.

Replace .asciz with multiple .ascii directives and terminate with .asciz.
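
A rough C analogue of the problem, for illustration only: an embedded NUL
plays the role of the terminator that .asciz puts after every string, while
adjacent C string literals are pasted together the way a run of .ascii
directives followed by a final .asciz is. The array and function names are
hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Like ".asciz "first line\n", "second line\n"": a NUL after each piece. */
static const char asciz_style[] = "first line\n\0second line\n";

/* Like ".ascii "first line\n"" then ".asciz "second line\n"": one terminator. */
static const char ascii_style[] = "first line\n" "second line\n";

/* Number of characters a consumer that stops at the first NUL would print. */
static size_t visible_length(const char *msg)
{
	return strlen(msg);
}
```

A consumer that stops at the first NUL sees only the first line of
asciz_style but both lines of ascii_style.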

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: iommu_sac_force can become static
Dmitri Vorobiev [Sun, 27 Apr 2008 23:15:58 +0000]
x86: iommu_sac_force can become static

The iommu_sac_force variable is needlessly defined global,
and this patch makes it static. Additionally, this variable
need not be explicitly initialized.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: add proper header for reboot_force
Dmitri Vorobiev [Sun, 27 Apr 2008 23:15:59 +0000]
x86: add proper header for reboot_force

This patch fixes one sparse warning by including the appropriate
header for the reboot_force symbol.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86 VISWS: build fix
Ingo Molnar [Mon, 28 Apr 2008 08:46:58 +0000]
x86 VISWS: build fix

the 'reboot_force' flag is a notion that non-PC subarchitectures do
not have.

also, unify the X86_BIOS_REBOOT option between 32-bit and 64-bit
and get rid of a few unnecessary Kconfig and Makefile complications
that way.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86, voyager: fix ioremap_nocache()
Ingo Molnar [Sun, 27 Apr 2008 21:21:03 +0000]
x86, voyager: fix ioremap_nocache()

James Bottomley reported that the following commit:

| commit 6371b495991debfd1417b17c2bc4f7d7bae05739
| Author: Ingo Molnar <mingo@elte.hu>
| Date:   Wed Jan 30 13:33:40 2008 +0100
|
|     x86: change ioremap() to default to uncached

broke Voyager.

James says:

" it broke a class of voyager machines: those which
  rely on the quad interrupt controller (QIC).  The precis of why they
  broke is because the QIC does IPIs (or CPIs in its terminology) via
  cache line interference: you interrupt a processor by moving a
  designated memory area to write exclusive in the cache (by simply
  writing to the line) and the CPU acks the interrupt by moving it back to
  read shared (by reading from it).  That area, is, of course, mapped by
  ioremap, so reversing the ioremap semantics and adding the uncached bit
  completely breaks the QIC. "

Sorry about that!

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agohpet: fix
Ingo Molnar [Sun, 27 Apr 2008 12:04:14 +0000]
hpet: fix

Al Viro pointed out that there's a missing readl() of timer->hpet_config,
found by Sparse.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: unexport kmap_atomic_to_page
Adrian Bunk [Mon, 21 Apr 2008 08:51:44 +0000]
x86: unexport kmap_atomic_to_page

This patch removes the no longer used export of kmap_atomic_to_page.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agox86: remove Xgt_desc_struct
Adrian Bunk [Mon, 21 Apr 2008 08:47:46 +0000]
x86: remove Xgt_desc_struct

The comment says it should have been removed in 2.6.25.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

11 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux...
Linus Torvalds [Wed, 30 Apr 2008 18:52:52 +0000]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (179 commits)
  ACPI: Fix acpi_processor_idle and idle= boot parameters interaction
  acpi: fix section mismatch warning in pnpacpi
  intel_menlo: fix build warning
  ACPI: Cleanup: Remove unneeded, multiple local dummy variables
  ACPI: video - fix permissions on some proc entries
  ACPI: video - properly handle errors when registering proc elements
  ACPI: video - do not store invalid entries in attached_array list
  ACPI: re-name acpi_pm_ops to acpi_suspend_ops
  ACER_WMI/ASUS_LAPTOP: fix build bug
  thinkpad_acpi: fix possible NULL pointer dereference if kstrdup failed
  ACPI: check a return value correctly in acpi_power_get_context()
  #if 0 acpi/bay.c:eject_removable_drive()
  eeepc-laptop: add hwmon fan control
  eeepc-laptop: add backlight
  eeepc-laptop: add base driver
  ACPI: thinkpad-acpi: bump up version to 0.20
  ACPI: thinkpad-acpi: fix selects in Kconfig
  ACPI: thinkpad-acpi: use a private workqueue
  ACPI: thinkpad-acpi: fluff really minor fix
  ACPI: thinkpad-acpi: use uppercase for "LED" on user documentation
  ...

Fixed conflicts in drivers/acpi/video.c and drivers/misc/intel_menlow.c
manually.

11 years agoMerge branch 'pnp' into release
Len Brown [Wed, 30 Apr 2008 17:59:05 +0000]
Merge branch 'pnp' into release

11 years agoMerge branches 'release', 'acpica', 'bugzilla-10224', 'bugzilla-9772', 'bugzilla...
Len Brown [Wed, 30 Apr 2008 17:58:00 +0000]
Merge branches 'release', 'acpica', 'bugzilla-10224', 'bugzilla-9772', 'bugzilla-9916', 'ec', 'eeepc', 'idle', 'misc', 'pm-legacy', 'sysfs-links-2.6.26', 'thermal', 'thinkpad' and 'video' into release

11 years agoACPI: Fix acpi_processor_idle and idle= boot parameters interaction
Venkatesh Pallipadi [Wed, 30 Apr 2008 17:57:15 +0000]
ACPI: Fix acpi_processor_idle and idle= boot parameters interaction

acpi_processor_idle and "idle=" boot parameter interaction is broken.
The problem is that, at boot time acpi driver is checking for "idle=" boot
option and not registering the acpi idle handler. But, when there is a CST
changed callback (typically when switching AC <-> battery or suspend-resume)
there are no checks for boot_option_idle_override and acpi idle handler tries
to get installed with nasty side effects.

With CPU_IDLE configured this issue results in a nasty oops on CST
change callback and without CPU_IDLE there is no oops, but boot option
of "idle=" gets ignored and acpi idle handler gets installed.

Change the behavior to not do anything in acpi idle handler when there is a
"idle=" boot option.

Note that the problem is only there when "idle=" boot option is used.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

11 years agoacpi: fix section mismatch warning in pnpacpi
Sam Ravnborg [Tue, 29 Apr 2008 20:52:01 +0000]
acpi: fix section mismatch warning in pnpacpi

Fix following section mismatch warning:
WARNING: vmlinux.o(.text+0x153d69): Section mismatch in reference from the function is_exclusive_device() to the variable .init.data:excluded_id_list

is_exclusive_device is only used from __init context so document
this with the __init annotation and get rid of the warning.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Len Brown <len.brown@intel.com>

11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Wed, 30 Apr 2008 16:22:27 +0000]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  [ALSA] soc - neo1973_wm8753.c add suspend and shutdown hooks for lm4857 chip
  [ALSA] soc - neo1973_wm8753.c change maintainer contact info
  [ALSA] soc - neo1973_wm8753.c cleanup checkpatch issues
  [ALSA] soc - ln2440sbc_alc650 - Fix checkpatch warnings
  [ALSA] soc - s3c24xx-pcm - Fix checkpatch warnings
  [ALSA] soc - s3c2443-ac97 - Fix checkpatch warnings
  [ALSA] soc - wm8753 - Clean up checkpatch warnings

11 years ago[ALSA] soc - neo1973_wm8753.c add suspend and shutdown hooks for lm4857 chip
Graeme Gregory [Wed, 30 Apr 2008 18:26:45 +0000]
[ALSA] soc - neo1973_wm8753.c add suspend and shutdown hooks for lm4857 chip

Patch taken from the openmoko bugtracker
http://bugzilla.openmoko.org/cgi-bin/bugzilla/show_bug.cgi?id=781

This patch adds Suspend/Resume and Shutdown support for the lm4857 to
the driver.

Signed-off-by: Graeme Gregory <graeme@openmoko.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

11 years ago[ALSA] soc - neo1973_wm8753.c change maintainer contact info
Graeme Gregory [Wed, 30 Apr 2008 18:25:23 +0000]
[ALSA] soc - neo1973_wm8753.c change maintainer contact info

I have moved workplaces since I originally wrote this driver so update
the contact info for new employers.

Signed-off-by: Graeme Gregory <graeme@openmoko.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

11 years ago[ALSA] soc - neo1973_wm8753.c cleanup checkpatch issues
Graeme Gregory [Wed, 30 Apr 2008 18:24:54 +0000]
[ALSA] soc - neo1973_wm8753.c cleanup checkpatch issues

Clean up a few issues with the file that checkpatch noted, no functionality
changes.

Signed-off-by: Graeme Gregory <graeme@openmoko.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

11 years ago[ALSA] soc - ln2440sbc_alc650 - Fix checkpatch warnings
Mark Brown [Wed, 30 Apr 2008 15:19:57 +0000]
[ALSA] soc - ln2440sbc_alc650 - Fix checkpatch warnings

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

11 years ago[ALSA] soc - s3c24xx-pcm - Fix checkpatch warnings
Mark Brown [Wed, 30 Apr 2008 15:19:32 +0000]
[ALSA] soc - s3c24xx-pcm - Fix checkpatch warnings

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

11 years ago[ALSA] soc - s3c2443-ac97 - Fix checkpatch warnings
Mark Brown [Wed, 30 Apr 2008 15:19:07 +0000]
[ALSA] soc - s3c2443-ac97 - Fix checkpatch warnings

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

11 years ago[ALSA] soc - wm8753 - Clean up checkpatch warnings
Mark Brown [Wed, 30 Apr 2008 15:18:43 +0000]
[ALSA] soc - wm8753 - Clean up checkpatch warnings

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Wed, 30 Apr 2008 15:46:16 +0000]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: remove duplicated include
  sparc: Add kgdb support.
  kgdbts: Sparc needs sstep emulation.
  sparc32: Kill smp_message_pass() and related code.
  sparc64: Kill PIL_RESERVED, unused.
  sparc64: Split entry.S up into seperate files.

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Wed, 30 Apr 2008 15:45:48 +0000]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (53 commits)
  tcp: Overflow bug in Vegas
  [IPv4] UFO: prevent generation of chained skb destined to UFO device
  iwlwifi: move the selects to the tristate drivers
  ipv4: annotate a few functions __init in ipconfig.c
  atm: ambassador: vcc_sf semaphore to mutex
  MAINTAINERS: The socketcan-core list is subscribers-only.
  netfilter: nf_conntrack: padding breaks conntrack hash on ARM
  ipv4: Update MTU to all related cache entries in ip_rt_frag_needed()
  sch_sfq: use del_timer_sync() in sfq_destroy()
  net: Add compat support for getsockopt (MCAST_MSFILTER)
  net: Several cleanups for the setsockopt compat support.
  ipvs: fix oops in backup for fwmark conn templates
  bridge: kernel panic when unloading bridge module
  bridge: fix error handling in br_add_if()
  netfilter: {nfnetlink,ip,ip6}_queue: fix skb_over_panic when enlarging packets
  netfilter: x_tables: fix net namespace leak when reading /proc/net/xxx_tables_names
  netfilter: xt_TCPOPTSTRIP: signed tcphoff for ipv6_skip_exthdr() retval
  tcp: Limit cwnd growth when deferring for GSO
  tcp: Allow send-limited cwnd to grow up to max_burst when gso disabled
  [netdrvr] gianfar: Determine TBIPA value dynamically
  ...

11 years agoinlining: do not allow gcc below version 4 to optimize inlining
Ingo Molnar [Tue, 29 Apr 2008 22:15:31 +0000]
inlining: do not allow gcc below version 4 to optimize inlining

fix the condition to match intention: always use the old inlining
behavior on all gcc versions below 4.

this should solve the UML build problem.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoUpdate .mailmap
S.Çağlar Onur [Wed, 30 Apr 2008 12:29:02 +0000]
Update .mailmap

I realize that some maintainers' email clients and/or scripts cannot
handle UTF-8 encoded names properly; as a result, your ChangeLogs
display me as two different people :).

The following patch adds my correctly encoded name to .mailmap, to
prevent it from appearing incorrectly encoded or badly displayed.

Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoMerge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Wed, 30 Apr 2008 15:38:30 +0000]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] Update default configuration.
  [S390] use generic sys_ptrace
  [S390] Remove self ptrace IEEE_IP hack.
  [S390] Convert to SPARSEMEM & SPARSEMEM_VMEMMAP
  [S390] System z large page support.
  [S390] Convert machine feature detection code to C.
  [S390] vmemmap: use clear_table to initialise page tables.
  [S390] Move stfl to system.h and delete duplicated version.
  [S390] uaccess_mvcos: #ifdef config dependent code.
  [S390] cpu topology: Fix possible deadlock.
  [S390] Add topology_core_siblings to topology.h
  [S390] cio: Make isc handling more robust.
  [S390] remove -traditional
  [S390] Automatically detect added cpus.
  [S390] smp: Fix locking order.
  [S390] Add missing ifndef/define to include/asm-s390/sysinfo.h.
  [S390] Move show_regs to traps.c.
  [S390] cio: Use strict_strtoul() for attributes.

11 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
Linus Torvalds [Wed, 30 Apr 2008 15:37:40 +0000]
Merge branch 'master' of git://git./linux/kernel/git/paulus/powerpc

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  [POWERPC] Fix crashkernel= handling when no crashkernel= specified
  [POWERPC] Make emergency stack safe for current_thread_info() use
  [POWERPC] spufs: add .gitignore for spu_save_dump.h & spu_restore_dump.h
  [POWERPC] spufs: trace spu_acquire_saved events
  [POWERPC] spufs: fix marker name for find_victim
  [POWERPC] spufs: add marker for destroy_spu_context
  [POWERPC] spufs: add sputrace marker parameter names
  [POWERPC] spufs: add context switch notification log
  [POWERPC] mpc5200: defconfigs for CM5200, Lite5200B, Motion-PRO and TQM5200
  [POWERPC] mpc5200: Switch mpc5200 dts files to dts-v1 format
  [POWERPC] mpc5200: Fix FEC error handling on FIFO errors
  [POWERPC] mpc5200: add Phytec pcm030 board support
  [POWERPC] mpc5200: add gpiolib support for mpc5200
  [POWERPC] mpc5200: add interrupt type function
  [POWERPC] mpc5200: Fix unterminated of_device_id table

11 years agofix drivers/media/common/tuners/ build bug
Ingo Molnar [Wed, 30 Apr 2008 09:50:11 +0000]
fix drivers/media/common/tuners/ build bug

x86.git randconfig testing found a build failure on latest -git:

 drivers/built-in.o: In function `set_type':
 tuner-core.c:(.text+0x2a9a26): undefined reference to `tea5761_attach'
 tuner-core.c:(.text+0x2a9d05): undefined reference to `tda9887_attach'
 tuner-core.c:(.text+0x2a9d51): undefined reference to `xc2028_attach'
 tuner-core.c:(.text+0x2a9e22): undefined reference to `tda829x_attach'
 tuner-core.c:(.text+0x2a9e3f): undefined reference to `microtune_attach'
 drivers/built-in.o: In function `tuner_probe':
 tuner-core.c:(.text+0x2aa18a): undefined reference to `tda829x_probe'
 tuner-core.c:(.text+0x2aa302): undefined reference to `tea5761_autodetection'

with the following config:

 http://redhat.com/~mingo/misc/config-Wed_Apr_30_10_21_40_CEST_2008.bad

the problem is caused by the drivers/media/common/tuners/ subdirectory
not being part of the kbuild hierarchy anymore, due to commit
7c91f0624 ("V4L/DVB(7767): Move tuners to common/tuners").

this seems similar to the problem also reported by Mike Galbraith.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agorevert "memory hotplug: allocate usemap on the section with pgdat"
Andrew Morton [Wed, 30 Apr 2008 07:55:17 +0000]
revert "memory hotplug: allocate usemap on the section with pgdat"

This:

commit 86f6dae1377523689bd8468fed2f2dd180fc0560
Author: Yasunori Goto <y-goto@jp.fujitsu.com>
Date:   Mon Apr 28 02:13:33 2008 -0700

    memory hotplug: allocate usemap on the section with pgdat

    Usemaps are allocated on the section which has pgdat by this.

    Because usemap size is very small, many other sections usemaps are allocated
    on only one page.  If a section has usemap, it can't be removed until removing
    other sections.  This dependency is not desirable for memory removing.

    Pgdat has similar feature.  When a section has pgdat area, it must be the last
    section for removing on the node.  So, if section A has pgdat and section B
    has usemap for section A, Both sections can't be removed due to dependency
    each other.

    To solve this issue, this patch collects usemap on same section with pgdat.
    If other sections doesn't have any dependency, this section will be able to be
    removed finally.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

broke davem's sparc64 bootup.  Revert it while we work out what went wrong.

Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agomm: fix warning on memory offline
Nick Piggin [Wed, 30 Apr 2008 07:55:16 +0000]
mm: fix warning on memory offline

KAMEZAWA Hiroyuki found a warning message in the buffer dirtying code that
is coming from the page migration caller.

WARNING: at fs/buffer.c:720 __set_page_dirty+0x330/0x360()
Call Trace:
 [<a000000100015220>] show_stack+0x80/0xa0
 [<a000000100015270>] dump_stack+0x30/0x60
 [<a000000100089ed0>] warn_on_slowpath+0x90/0xe0
 [<a0000001001f8b10>] __set_page_dirty+0x330/0x360
 [<a0000001001ffb90>] __set_page_dirty_buffers+0xd0/0x280
 [<a00000010012fec0>] set_page_dirty+0xc0/0x260
 [<a000000100195670>] migrate_page_copy+0x5d0/0x5e0
 [<a000000100197840>] buffer_migrate_page+0x2e0/0x3c0
 [<a000000100195eb0>] migrate_pages+0x770/0xe00

What was happening is that migrate_page_copy wants to transfer the PG_dirty
bit from old page to new page, so what it would do is set_page_dirty(newpage).
However set_page_dirty() is used to set the entire page dirty, whereas in
this case, only part of the page was dirty, and it also was not uptodate.

Marking the whole page dirty with set_page_dirty would lead to corruption or
unresolvable conditions -- a dirty && !uptodate page and dirty && !uptodate
buffers.

Possibly we could just ClearPageDirty(oldpage); SetPageDirty(newpage);
however in the interests of keeping the change minimal...

Signed-off-by: Nick Piggin <npiggin@suse.de>
Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoDrop the exporting of empty <linux/byteorder/generic.h>
Robert P. J. Day [Wed, 30 Apr 2008 07:55:14 +0000]
Drop the exporting of empty <linux/byteorder/generic.h>

Fix up the contents of <linux/byteorder/> so that it doesn't export a
content-free generic.h to user space.  This involves:

* Removing the __KERNEL__ tests from generic.h and dropping it from
  Kbuild.
* Wrapping the inclusions of generic.h in both big_endian.h and
  little_endian.h in __KERNEL__ tests.
* Shifting big_endian.h and little_endian.h from header-y to
  unifdef-y in Kbuild.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoremove __KERNEL__ tests of unexported headers under asm-generic/
Robert P. J. Day [Wed, 30 Apr 2008 07:55:13 +0000]
remove __KERNEL__ tests of unexported headers under asm-generic/

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoRemove "#ifdef __KERNEL__" checks from unexported headers
Robert P. J. Day [Wed, 30 Apr 2008 07:55:12 +0000]
Remove "#ifdef __KERNEL__" checks from unexported headers

Remove the "#ifdef __KERNEL__" tests from unexported header files in
linux/include whose entire contents are wrapped in that preprocessor
test.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoserial: replace remaining __FUNCTION__ occurrences
Harvey Harrison [Wed, 30 Apr 2008 07:55:10 +0000]
serial: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
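
To illustrate the replacement made by this series: __func__ is the predefined identifier specified by C99, while __FUNCTION__ is the older gcc-specific spelling. A minimal sketch follows; serial_probe() and its message are made up for illustration and are not taken from the patch.

```c
#include <stdio.h>

/*
 * __func__ behaves like a static const char array holding the name of
 * the enclosing function.  serial_probe() is a hypothetical name used
 * only for this sketch.
 */
static const char *serial_probe(void)
{
	static char msg[64];

	/* Before the conversion this would have read __FUNCTION__. */
	snprintf(msg, sizeof(msg), "%s: probing port", __func__);
	return msg;
}
```

Calling serial_probe() yields a message prefixed with the function's own name, so the prefix keeps tracking the function if it is renamed.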

11 years agodrivers/char: replace remaining __FUNCTION__ occurrences
Harvey Harrison [Wed, 30 Apr 2008 07:55:10 +0000]
drivers/char: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agofs: replace remaining __FUNCTION__ occurrences
Harvey Harrison [Wed, 30 Apr 2008 07:55:09 +0000]
fs: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoafs: replace remaining __FUNCTION__ occurrences
Harvey Harrison [Wed, 30 Apr 2008 07:55:09 +0000]
afs: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agolib: replace remaining __FUNCTION__ occurrences
Harvey Harrison [Wed, 30 Apr 2008 07:55:08 +0000]
lib: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agokernel: replace remaining __FUNCTION__ occurrences
Harvey Harrison [Wed, 30 Apr 2008 07:55:08 +0000]
kernel: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agomm: remove remaining __FUNCTION__ occurrences
Harvey Harrison [Wed, 30 Apr 2008 07:55:07 +0000]
mm: remove remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agobrd: modify ramdisk device to be able to manage partitions
Laurent Vivier [Wed, 30 Apr 2008 07:55:06 +0000]
brd: modify ramdisk device to be able to manage partitions

This patch adds partition management for Block RAM Device (BRD).

This patch keeps the BRD and loop device drivers in sync.

This patch adds a parameter to the module, max_part, to specify
the maximum number of partitions per RAM device.

Example:

# modprobe brd max_part=63
# ls -l /dev/ram*
brw-rw---- 1 root disk 1,   0 2008-04-03 13:39 /dev/ram0
brw-rw---- 1 root disk 1,  64 2008-04-03 13:39 /dev/ram1
brw-rw---- 1 root disk 1, 640 2008-04-03 13:39 /dev/ram10
brw-rw---- 1 root disk 1, 704 2008-04-03 13:39 /dev/ram11
brw-rw---- 1 root disk 1, 768 2008-04-03 13:39 /dev/ram12
brw-rw---- 1 root disk 1, 832 2008-04-03 13:39 /dev/ram13
brw-rw---- 1 root disk 1, 896 2008-04-03 13:39 /dev/ram14
brw-rw---- 1 root disk 1, 960 2008-04-03 13:39 /dev/ram15
brw-rw---- 1 root disk 1, 128 2008-04-03 13:39 /dev/ram2
brw-rw---- 1 root disk 1, 192 2008-04-03 13:39 /dev/ram3
brw-rw---- 1 root disk 1, 256 2008-04-03 13:39 /dev/ram4
brw-rw---- 1 root disk 1, 320 2008-04-03 13:39 /dev/ram5
brw-rw---- 1 root disk 1, 384 2008-04-03 13:39 /dev/ram6
brw-rw---- 1 root disk 1, 448 2008-04-03 13:39 /dev/ram7
brw-rw---- 1 root disk 1, 512 2008-04-03 13:39 /dev/ram8
brw-rw---- 1 root disk 1, 576 2008-04-03 13:39 /dev/ram9
# fdisk /dev/ram0
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): o
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-2, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-2, default 2): 2

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
# ls -l /dev/ram0*
brw-rw---- 1 root disk 1, 0 2008-04-03 13:40 /dev/ram0
brw-rw---- 1 root disk 1, 1 2008-04-03 13:40 /dev/ram0p1
# mkfs /dev/ram0p1
mke2fs 1.40-WIP (14-Nov-2006)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
4016 inodes, 16032 blocks
801 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=16515072
2 block groups
8192 blocks per group, 8192 fragments per group
2008 inodes per group
Superblock backups stored on blocks:
8193

Writing inode tables: done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
# mount /dev/ram0p1 /mnt
# df /mnt
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/ram0p1              15521       138     14582   1% /mnt
# ls -l /mnt
total 12
drwx------ 2 root root 12288 2008-04-03 13:41 lost+found
# umount /mnt
# rmmod brd

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
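
The minor numbers in the listing above (ram1 at 64, ram10 at 640, ram0p1 at 1) are consistent with each RAM device reserving a block of 64 minors when max_part=63. The sketch below reproduces that arithmetic. It assumes the block size is the smallest power of two greater than max_part; that rule is inferred from the listing, not quoted from the driver, and both function names are hypothetical.

```c
/*
 * Assumption: each disk reserves the smallest power of two of minors
 * that is strictly greater than max_part (64 for max_part=63).
 */
static unsigned int minors_per_disk(unsigned int max_part)
{
	unsigned int n = 1;

	/* The "n &&" guard stops the loop if the shift ever wraps to 0. */
	while (n && n <= max_part)
		n <<= 1;
	return n;
}

/* Minor number of the whole-disk node /dev/ram<index>. */
static unsigned int first_minor(unsigned int index, unsigned int max_part)
{
	return index * minors_per_disk(max_part);
}
```

Under the same assumption, partition N of a device would sit at first_minor() + N, which matches /dev/ram0p1 having minor 1.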

11 years agoadd hrtimer specific debugobjects code
Thomas Gleixner [Wed, 30 Apr 2008 07:55:04 +0000]
add hrtimer specific debugobjects code

hrtimers now have dynamic users in the network code.  Put them under
debugobjects surveillance as well.

Add calls to the generic object debugging infrastructure and provide fixup
functions which keep the system alive when recoverable problems have
been detected by the object debugging core code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agodebugobjects: add timer specific object debugging code
Thomas Gleixner [Wed, 30 Apr 2008 07:55:03 +0000]
debugobjects: add timer specific object debugging code

Add calls to the generic object debugging infrastructure and provide fixup
functions which keep the system alive when recoverable problems have
been detected by the object debugging core code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agodebugobjects: add documentation
Thomas Gleixner [Wed, 30 Apr 2008 07:55:02 +0000]
debugobjects: add documentation

Add a DocBook for debugobjects.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoinfrastructure to debug (dynamic) objects
Thomas Gleixner [Wed, 30 Apr 2008 07:55:01 +0000]
infrastructure to debug (dynamic) objects

We can see an ever repeating problem pattern with objects of any kind in the
kernel:

1) freeing of active objects
2) reinitialization of active objects

Both problems can be hard to debug because the crash happens at a point where
we have no chance to decode the root cause anymore.  One problem spot is
kernel timers, where the detection of the problem often happens in interrupt
context and usually causes the machine to panic.

While working on a timer related bug report I had to hack specialized code
into the timer subsystem to get a reasonable hint for the root cause.  This
debug hack was fine for temporary use, but far from a mergeable solution due
to the intrusiveness into the timer code.

The code further lacked the ability to detect and report the root cause
instantly and keep the system operational.

Keeping the system operational is important to get hold of the debug
information without special debugging aids like serial consoles and special
knowledge of the bug reporter.

The problems described above are not restricted to timers, but timers tend to
expose them as a full system crash.  Other objects are less explosive,
but the symptoms caused by such mistakes can be even harder to debug.

Instead of creating specialized debugging code for the timer subsystem a
generic infrastructure is created which allows developers to verify their code
and provides an easy to enable debug facility for users in case of trouble.

The debugobjects core code keeps track of operations on static and dynamic
objects by inserting them into a hashed list and sanity checking them on
object operations and provides additional checks whenever kernel memory is
freed.

The tracked object operations are:
- initializing an object
- adding an object to a subsystem list
- deleting an object from a subsystem list

Each operation is sanity checked before it is executed, and the
subsystem-specific code can provide a fixup function which prevents the
operation from doing damage.  When the sanity check triggers, a warning
message and a stack trace are printed.

The list of operations can be extended if the need arises.  For now it's
limited to the requirements of the first user (timers).

The core code enqueues the objects into hash buckets.  The hash index is
generated from the address of the object to simplify the lookup for the check
on kfree/vfree.  Each bucket has its own spinlock to avoid contention on a
global lock.

The debug code can be compiled in without being active.  The runtime overhead
is minimal and could be optimized by asm alternatives.  A kernel command line
option enables the debugging code.

Thanks to Ingo Molnar for review, suggestions and cleanup patches.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
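
The tracking scheme described in this commit message can be sketched in user space roughly as follows. This is not the kernel implementation: the names (track_op, struct tracked, NBUCKETS), the use of C11 atomics as a stand-in for a spinlock, and the plain -1 return in place of a warning, stack trace and fixup callback are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 64	/* assumption: any power of two would do */

enum obj_op { OP_INIT, OP_ACTIVATE, OP_DEACTIVATE, OP_FREE };

struct tracked {
	const void *addr;
	bool active;
	struct tracked *next;
};

struct bucket {
	atomic_bool locked;	/* per-bucket spinlock instead of one global lock */
	struct tracked *head;
};

static struct bucket table[NBUCKETS];

/* Returns 0 if the operation is sane, -1 if it would misuse the object. */
int track_op(const void *addr, enum obj_op op)
{
	/* The hash index is derived from the object's address. */
	struct bucket *b = &table[((uintptr_t)addr >> 4) % NBUCKETS];
	struct tracked **pp, *t;
	int ret = 0;

	while (atomic_exchange(&b->locked, true))
		;	/* spin until the current holder unlocks */

	for (pp = &b->head; *pp && (*pp)->addr != addr; pp = &(*pp)->next)
		;	/* stop at the matching entry or at the tail */
	t = *pp;

	switch (op) {
	case OP_INIT:
		if (t && t->active) {
			ret = -1;	/* reinitialization of an active object */
		} else if (!t && (t = calloc(1, sizeof(*t))) != NULL) {
			t->addr = addr;	/* start tracking; active stays false */
			*pp = t;
		}
		break;
	case OP_ACTIVATE:
	case OP_DEACTIVATE:
		if (!t || t->active == (op == OP_ACTIVATE))
			ret = -1;	/* unknown object or redundant transition */
		else
			t->active = (op == OP_ACTIVATE);
		break;
	case OP_FREE:
		if (t && t->active) {
			ret = -1;	/* freeing an active object */
		} else if (t) {
			*pp = t->next;	/* forget the object once it is gone */
			free(t);
		}
		break;
	}

	atomic_store(&b->locked, false);
	return ret;
}
```

Because the bucket is chosen from the address alone, a check at free time only has to lock and walk one short list, which is the lookup simplification the message refers to.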

11 years agoslab: add a flag to prevent debug_free checks on a kmem_cache
Thomas Gleixner [Wed, 30 Apr 2008 07:54:59 +0000]
slab: add a flag to prevent debug_free checks on a kmem_cache

This is a preparatory patch for the debugobjects infrastructure.  The flag
prevents debug_free checks on kmem_caches.  This is necessary to avoid
recursive calls into a debug mechanism which uses a kmem_cache itself.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agodrivers: replace remaining __FUNCTION__ occurrences
Harvey Harrison [Wed, 30 Apr 2008 07:54:57 +0000]
drivers: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoAdd macros similar to min/max/min_t/max_t
Harvey Harrison [Wed, 30 Apr 2008 07:54:55 +0000]
Add macros similar to min/max/min_t/max_t

Also, change the variable names used in the min/max macros to avoid shadowed
variable warnings when min/max and min_t/max_t are nested.

Small formatting changes to make all the macros have a similar form.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix v4l build]
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
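
The shadowing issue mentioned above arises when two type-checking macros that declare the same temporary names are nested: the inner expansion redeclares the outer one's temporaries. A minimal sketch of the remedy, giving each macro its own names, is shown below. It relies on the GNU C extensions __typeof__ and statement expressions (gcc or clang); my_min, my_max, clamp_int and the temporary names are illustrative and not copied from the patch.

```c
/*
 * Each macro uses distinct temporaries, so my_min(my_max(...), ...)
 * does not redeclare a name from the enclosing expansion and stays
 * quiet under -Wshadow.  Requires GNU C (gcc or clang).
 */
#define my_min(x, y) ({				\
	__typeof__(x) _min1 = (x);		\
	__typeof__(y) _min2 = (y);		\
	_min1 < _min2 ? _min1 : _min2; })

#define my_max(x, y) ({				\
	__typeof__(x) _max1 = (x);		\
	__typeof__(y) _max2 = (y);		\
	_max1 > _max2 ? _max1 : _max2; })

/* Hypothetical helper that nests the two macros. */
static int clamp_int(int val, int lo, int hi)
{
	return my_min(my_max(val, lo), hi);
}
```

Nesting a macro inside itself would still reuse the same names, so this sketch only covers the mixed min/max case.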

11 years agoalloc_uid: cleanup
Andrew Morton [Wed, 30 Apr 2008 07:54:54 +0000]
alloc_uid: cleanup

Use kmem_cache_zalloc(), remove large amounts of initialisation code and
ifdeffery.

Note: this assumes that memset(*atomic_t, 0) correctly initialises the
atomic_t.  This is true for all present architectures and if it becomes false
for a future architecture then we'll need to make large changes all over the
place anyway.

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
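
The cleanup pattern is to allocate zeroed memory and then set only the fields that need non-zero values. A user-space analogue, using calloc() in place of kmem_cache_zalloc(), might look like the sketch below; struct user_rec and its fields are hypothetical and do not mirror the kernel's user_struct.

```c
#include <stdlib.h>

/* Hypothetical record; the field names are for illustration only. */
struct user_rec {
	unsigned int uid;
	int refcount;
	int processes;
	int files;
};

static struct user_rec *user_alloc(unsigned int uid)
{
	/* Zeroed allocation: every counter starts at 0 without explicit code. */
	struct user_rec *u = calloc(1, sizeof(*u));

	if (!u)
		return NULL;
	u->uid = uid;		/* only non-zero fields need initialising */
	u->refcount = 1;
	return u;
}
```

As the commit message notes for atomic_t, this only works when an all-zero bit pattern is a valid initial value for every field.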

11 years agohfsplus: fix warning with 64k PAGE_SIZE
Andrew Morton [Wed, 30 Apr 2008 07:54:54 +0000]
hfsplus: fix warning with 64k PAGE_SIZE

fs/hfsplus/btree.c: In function 'hfsplus_bmap_alloc':
fs/hfsplus/btree.c:239: warning: comparison is always false due to limited range of data type

But this might hide a real bug?

Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agohfs: fix warning with 64k PAGE_SIZE
Andrew Morton [Wed, 30 Apr 2008 07:54:53 +0000]
hfs: fix warning with 64k PAGE_SIZE

fs/hfs/btree.c: In function 'hfs_bmap_alloc':
fs/hfs/btree.c:263: warning: comparison is always false due to limited range of data type

The patch makes the warning go away, but the code might actually be buggy?

Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoprintk: don't read beyond string arguments' terminating zero
Markus Armbruster [Wed, 30 Apr 2008 07:54:52 +0000]
printk: don't read beyond string arguments' terminating zero

Fix update_console_cmdline() not to read beyond the terminating zero of its
name argument.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
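
The class of bug is a fixed-size copy from a string that is shorter than the destination buffer, which reads bytes past the source's terminator. The sketch below shows a bounded copy that stops at the terminating zero. It illustrates the idea only; copy_name() is a hypothetical helper, not the code changed in update_console_cmdline().

```c
#include <stddef.h>

/*
 * Copy at most size - 1 characters from src and always terminate dst.
 * The loop stops at src's terminating zero, so no byte after it is read.
 * Returns the number of characters copied.
 */
static size_t copy_name(char *dst, const char *src, size_t size)
{
	size_t i = 0;

	if (size == 0)
		return 0;
	while (i + 1 < size && src[i] != '\0') {
		dst[i] = src[i];
		i++;
	}
	dst[i] = '\0';
	return i;
}
```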

11 years agoBasic braille screen reader support
Samuel Thibault [Wed, 30 Apr 2008 07:54:51 +0000]
Basic braille screen reader support

This adds minimalistic braille screen reader support.  It is meant to
be used by blind people, e.g. on boot failures or when / cannot be mounted,
where userland screen readers cannot work.

[akpm@linux-foundation.org: fix exports]
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Cc: Jiri Kosina <jikos@jikos.cz>
Cc: Dmitry Torokhov <dtor@mail.ru>
Acked-by: Alan Cox <alan@redhat.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoasm-*/futex.h should include linux/uaccess.h
Jeff Dike [Wed, 30 Apr 2008 07:54:49 +0000]
asm-*/futex.h should include linux/uaccess.h

Lots of asm-*/futex.h call pagefault_enable and pagefault_disable, which
are declared in linux/uaccess.h, without including linux/uaccess.h.

They all include asm/uaccess.h, so this patch replaces asm/uaccess.h
with linux/uaccess.h.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agosysv: [bl]e*_add_cpu conversion
Marcin Slusarz [Wed, 30 Apr 2008 07:54:49 +0000]
sysv: [bl]e*_add_cpu conversion

replace all:
big/little_endian_variable = cpu_to_[bl]eX([bl]eX_to_cpu(big/little_endian_variable) +
expression_in_cpu_byteorder);
with:
[bl]eX_add_cpu(&big/little_endian_variable, expression_in_cpu_byteorder);
generated with semantic patch

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
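
The conversion replaces an open-coded "convert to CPU order, add, convert back" sequence with one helper call. The user-space sketch below shows the little-endian 32-bit case. The helpers are written here with portable byte accesses for illustration; the le32 typedef and these definitions are assumptions, not the kernel's own.

```c
#include <stdint.h>

typedef uint32_t le32;	/* a value stored in little-endian byte order */

static uint32_t le32_to_cpu(le32 v)
{
	const uint8_t *p = (const uint8_t *)&v;

	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static le32 cpu_to_le32(uint32_t v)
{
	le32 out;
	uint8_t *p = (uint8_t *)&out;

	p[0] = (uint8_t)v;
	p[1] = (uint8_t)(v >> 8);
	p[2] = (uint8_t)(v >> 16);
	p[3] = (uint8_t)(v >> 24);
	return out;
}

/* Replaces: *var = cpu_to_le32(le32_to_cpu(*var) + val); */
static void le32_add_cpu(le32 *var, uint32_t val)
{
	*var = cpu_to_le32(le32_to_cpu(*var) + val);
}
```

The helper names the variable once instead of twice, which removes the risk of converting one variable and storing into another.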

11 years agoquota: le*_add_cpu conversion
Marcin Slusarz [Wed, 30 Apr 2008 07:54:48 +0000]
quota: le*_add_cpu conversion

replace all:
little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) +
expression_in_cpu_byteorder);
with:
leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder);
generated with semantic patch

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agohfs/hfsplus: be*_add_cpu conversion
Marcin Slusarz [Wed, 30 Apr 2008 07:54:47 +0000]
hfs/hfsplus: be*_add_cpu conversion

replace all:
big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) +
expression_in_cpu_byteorder);
with:
beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder);
generated with semantic patch

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoaffs: be*_add_cpu conversion
Marcin Slusarz [Wed, 30 Apr 2008 07:54:47 +0000]
affs: be*_add_cpu conversion

replace all:
big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) +
expression_in_cpu_byteorder);
with:
beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder);
generated with semantic patch

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agoreiserfs: use open_bdev_excl
Christoph Hellwig [Wed, 30 Apr 2008 07:54:46 +0000]
reiserfs: use open_bdev_excl

Use the proper helper to open a block device by name for filesystem use.
This makes sure it's properly claimed (also added for open-by-number) and
gets rid of the struct file abuse.

Tested by mounting a reiserfs filesystem with external journal.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Acked-by: Edward Shishkin <edward.shishkin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agofuse: fix sparse warnings
Miklos Szeredi [Wed, 30 Apr 2008 07:54:45 +0000]
fuse: fix sparse warnings

fs/fuse/dev.c:306:2: warning: context imbalance in 'wait_answer_interruptible' - unexpected unlock
fs/fuse/dev.c:361:2: warning: context imbalance in 'request_wait_answer' - unexpected unlock
fs/fuse/dev.c:1002:4: warning: context imbalance in 'end_io_requests' - unexpected unlock

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agofuse: fix race in llseek
Miklos Szeredi [Wed, 30 Apr 2008 07:54:45 +0000]
fuse: fix race in llseek

Fuse doesn't use i_mutex to protect setting i_size, and so
generic_file_llseek() can be racy: it doesn't use i_size_read().

So add a fuse-specific llseek method, which does use i_size_read().

[akpm@linux-foundation.org: make `retval' loff_t]
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agofuse: fix node ID type
Miklos Szeredi [Wed, 30 Apr 2008 07:54:44 +0000]
fuse: fix node ID type

Node ID is 64bit but it is passed as unsigned long to some functions.  This
breakage wasn't noticed, because libfuse uses unsigned long too.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agofuse: fix max i/o size calculation
Miklos Szeredi [Wed, 30 Apr 2008 07:54:44 +0000]
fuse: fix max i/o size calculation

Fix a bug that Werner Baumann reported: fuse can send a bigger write request
than the maximum specified.  This only affected direct_io operation.

In addition set a sane minimum for the max_read and max_write tunables, so I/O
always makes some progress.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agofuse: update file size on short read
Miklos Szeredi [Wed, 30 Apr 2008 07:54:43 +0000]
fuse: update file size on short read

If the READ request returned a short count, then either

  - cached size is incorrect
  - filesystem is buggy, as short reads are only allowed on EOF

So assume that the size is wrong and refresh it, so that cached read() doesn't
zero fill the missing chunk.
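
One possible shape of such a refresh, as a userspace sketch.  It assumes that
"refresh" means lowering the cached size to the point where the short read
ended; the struct and function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

struct toy_inode {
    int64_t cached_size;
};

/* Called after a READ of 'requested' bytes at 'pos' returned 'got' bytes.
 * Returns 1 if the cached size was lowered, 0 otherwise. */
int toy_read_update_size(struct toy_inode *inode, int64_t pos,
                         int64_t requested, int64_t got)
{
    assert(got >= 0 && got <= requested);

    /* A short read that stops before the cached EOF means the cached
     * size is stale (or the filesystem is buggy); trust the read. */
    if (got < requested && pos + got < inode->cached_size) {
        inode->cached_size = pos + got;
        return 1;
    }
    return 0;
}
```

A short read that ends exactly at the cached size is the ordinary EOF case and
leaves the cached size alone.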

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agofuse: implement perform_write
Nick Piggin [Wed, 30 Apr 2008 07:54:42 +0000]
fuse: implement perform_write

Introduce fuse_perform_write.  With fusexmp (a passthrough filesystem), large
(1MB) writes into a backing tmpfs filesystem are sped up by almost 4 times
(256MB/s vs 71MB/s).

[mszeredi@suse.cz]:

 - split into smaller functions
 - testing
 - duplicate generic_file_aio_write(), so that there's no need to add a
   new ->perform_write() a_op.  Comment from hch.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agofuse: clean up setting i_size in write
Miklos Szeredi [Wed, 30 Apr 2008 07:54:41 +0000]
fuse: clean up setting i_size in write

Extract common code for setting i_size in write functions into a common
helper.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 years agofuse: support writable mmap
Miklos Szeredi [Wed, 30 Apr 2008 07:54:41 +0000]
fuse: support writable mmap

Quoting Linus (3 years ago, FUSE inclusion discussions):

  "User-space filesystems are hard to get right. I'd claim that they
   are almost impossible, unless you limit them somehow (shared
   writable mappings are the nastiest part - if you don't have those,
   you can reasonably limit your problems by limiting the number of
   dirty pages you accept through normal "write()" calls)."

Instead of attempting the impossible, I've just waited for the dirty page
accounting infrastructure to materialize (thanks to Peter Zijlstra and
others).  This nicely solved the biggest problem: limiting the number of pages
used for write caching.

Some small details remained, however, which this largish patch attempts to
address.  It provides a page writeback implementation for fuse, which is
completely safe against VM related deadlocks.  Performance may not be very
good for certain usage patterns, but generally it should be acceptable.

It has been tested extensively with fsx-linux and bash-shared-mapping.

Fuse page writeback design
--------------------------

fuse_writepage() allocates a new temporary page with GFP_NOFS|__GFP_HIGHMEM.
It copies the contents of the original page, and queues a WRITE request to the
userspace filesystem using this temp page.

The writeback is finished instantly from the MM's point of view: the page is
removed from the radix trees, and the PageDirty and PageWriteback flags are
cleared.

For the duration of the actual write, the NR_WRITEBACK_TEMP counter is
incremented.  The per-bdi writeback count is not decremented until the actual
write completes.

On dirtying the page, fuse waits for a previous write to finish before
proceeding.  This ensures that only one temporary page is in use at a time
for any one cached page.
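
The temp-page scheme described above can be modelled in userspace roughly like
this.  It is a single-threaded sketch with hypothetical toy_* names: malloc()
stands in for the page allocation, and instead of actually sleeping, dirtying
a page reports that the caller would have to wait:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

struct toy_page {
    unsigned char data[TOY_PAGE_SIZE];
    bool dirty;
    unsigned char *temp;   /* in-flight copy, NULL when no write is pending */
};

/* Start writeback: copy the page to a temp buffer and mark the original
 * clean right away.  Returns the temp buffer, or NULL on allocation failure. */
unsigned char *toy_writepage(struct toy_page *page)
{
    assert(page->temp == NULL);   /* at most one temp copy per page */

    unsigned char *tmp = malloc(TOY_PAGE_SIZE);
    if (tmp == NULL)
        return NULL;
    memcpy(tmp, page->data, TOY_PAGE_SIZE);
    page->temp = tmp;
    page->dirty = false;
    return tmp;
}

/* The userspace filesystem has answered the WRITE request. */
void toy_write_done(struct toy_page *page)
{
    free(page->temp);
    page->temp = NULL;
}

/* Dirty the page.  Returns false if a write is still pending, in which
 * case the real code would wait; this sketch just refuses. */
bool toy_dirty_page(struct toy_page *page, unsigned char byte)
{
    if (page->temp != NULL)
        return false;
    memset(page->data, byte, TOY_PAGE_SIZE);
    page->dirty = true;
    return true;
}
```

The original page is clean as soon as toy_writepage() returns, while the temp
copy stays alive until toy_write_done() is called.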

This approach is wasteful in both memory and CPU bandwidth, so why is this
complication needed?

The basic problem is that there can be no guarantee about the time in which
the userspace filesystem will complete a write.  It may be buggy or even
malicious, and fail to complete WRITE requests.  We don't want unrelated parts
of the system to grind to a halt in such cases.

Also a filesystem may need additional resources (particularly memory) to
complete a WRITE request.  There's a great danger of a deadlock if that
allocation may wait for the writepage to finish.

Currently there are several cases where the kernel can block on page
writeback:

  - allocation order is larger than PAGE_ALLOC_COSTLY_ORDER
  - page migration
  - throttle_vm_writeout (through NR_WRITEBACK)
  - sync(2)

Of course in some cases (fsync, msync) we explicitly want to allow blocking.
So for these cases new code has to be added to fuse, since the VM is not
tracking writeback pages for us any more.

As an extra safety measure, the maximum dirty ratio allocated to a single
fuse filesystem is set to 1% by default.  This way one (or several) buggy or
malicious fuse filesystems cannot slow down the rest of the system by hogging
dirty memory.

With appropriate privileges, this limit can be raised through
'/sys/class/bdi/<bdi>/max_ratio'.
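
For illustration, a minimal userspace sketch of raising the limit from C.  The
toy_* function names are hypothetical, the value is assumed to be written as a
decimal percentage followed by a newline, and the <bdi> name must be supplied
by the caller:

```c
#include <assert.h>
#include <stdio.h>

/* Build "/sys/class/bdi/<bdi>/max_ratio" into buf.
 * Returns the path length, or -1 if it does not fit. */
int toy_max_ratio_path(const char *bdi, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/sys/class/bdi/%s/max_ratio", bdi);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Write a percentage to the max_ratio file.
 * Returns 0 on success, -1 on any failure (missing bdi, no permission). */
int toy_set_max_ratio(const char *bdi, unsigned percent)
{
    char path[256];

    assert(percent <= 100);
    if (toy_max_ratio_path(bdi, path, sizeof path) < 0)
        return -1;

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = fprintf(f, "%u\n", percent) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Without the appropriate privileges, toy_set_max_ratio() simply returns -1.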

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>