9 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Tue, 27 Oct 2009 08:03:26 +0000]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/sh_eth.c

9 years agovlan: allow null VLAN ID to be used
Eric Dumazet [Tue, 27 Oct 2009 01:40:35 +0000]
vlan: allow null VLAN ID to be used

We currently use a 16 bit field (vlan_tci) to store VLAN ID/PRIO on a skb.

Null value is used as a special value, meaning vlan tagging not enabled.
This forbids use of null vlan ID.

As pointed by David, some drivers use the 3 high order bits (PRIO)

As VLAN ID is 12 bits, we can use the remaining bit (CFI) as a flag, and
allow null VLAN ID.

In case future code really wants to use VLAN_CFI_MASK, we'll have to use
a bit outside of vlan_tci.

#define VLAN_PRIO_MASK         0xe000 /* Priority Code Point */
#define VLAN_PRIO_SHIFT        13
#define VLAN_CFI_MASK          0x1000 /* Canonical Format Indicator */
#define VLAN_TAG_PRESENT       VLAN_CFI_MASK
#define VLAN_VID_MASK          0x0fff /* VLAN Identifier */

Reported-by: Gertjan Hofman <gertjan_hofman@yahoo.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agocan: sja1000: fix bug using library functions for skb allocation
Kurt Van Dijck [Tue, 27 Oct 2009 00:33:59 +0000]
can: sja1000: fix bug using library functions for skb allocation

Commit 7b6856a0 "can: provide library functions for skb allocation"
did not properly remove two lines of the SJA1000 driver resulting in
a 'skb_over_panic' when calling skb_put, as reported by Kurt.

Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agocxgb3: Set the rxq
Krishna Kumar [Fri, 23 Oct 2009 01:13:21 +0000]
cxgb3: Set the rxq

Set the rxq# for LRO when processing the last fragment of a
frame. This helps in fast txq selection for routing workloads.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosh_eth: Add asm/cacheflush.h
Nobuhiro Iwamatsu [Mon, 26 Oct 2009 13:49:50 +0000]
sh_eth: Add asm/cacheflush.h

Add include asm/cacheflush.h,  because declaration of __flush_purge_region
moved to asm/cacheflush.h.

Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoPPPoE: Fix flush/close races.
Michal Ostrowski [Mon, 26 Oct 2009 23:23:20 +0000]
PPPoE: Fix flush/close races.

Be more careful about the state of pointers during tear-down.
The "pppoe_dev" field can only be looked at safely while holding socket locks.
This subsequently allows for the flush_lock to be killed.

We depend on the PPPOX_CONNECTED state to tell us that that those fields are
valid, so whoever clears that state (pppox_unbind_sock()) is responsible for
the dev_put() call.

We also have to ensure that we delete_item() on all sockets before they are
cleaned up.

The need for these changes has been exposed by scenarios wherein namespace
bindings of ethernet devices change while there are ongoing PPPoE sessions,
which resulted in oopses due to unusual socket connection termination paths,
exposing these issues.

Signed-off-by: Michal Ostrowski <mostrows@gmail.com>
Reviewed-by: Cyril Gorcunov <gorcunov@gmail.com>
Reported-by: Denys Fedoryschenko <denys@visp.net.lb>
Tested-by: Denys Fedoryschenko <denys@visp.net.lb>

9 years agoe1000e: allow for swflag to be held over consecutive PHY accesses
Bruce Allan [Mon, 26 Oct 2009 11:24:02 +0000]
e1000e: allow for swflag to be held over consecutive PHY accesses

PCH-based parts (82577/82578) and some ICH8-based parts (82566) need to
hold the swflag (sw/fw/hw hardware semaphore) over consecutive PHY accesses
in order to perform sw-driven PHY configuration during initialization to
workaround known hardware issues (see follow-on patch).  This patch
provides new PHY read/write functions (and function pointers) that will
allow accessing the PHY registers assuming the swflag has already been
acquired.  The actual PHY register access code has moved into helper
functions that are called with a flag indicating whether or not the swflag
has already been acquired and acquires/releases it if not.

The functions called from within the updated PHY access functions had to be
updated to assume the swflag was already acquired, and other functions that
called those functions were also updated to acquire/release the swflag.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoe1000e: separate mutex usage between NVM and PHY/CSR register for ICHx/PCH
Bruce Allan [Mon, 26 Oct 2009 11:23:43 +0000]
e1000e: separate mutex usage between NVM and PHY/CSR register for ICHx/PCH

Accesses to NVM and PHY/CSR registers on ICHx/PCH-based parts are protected
from concurrent accesses with a mutex that is acquired when the access is
initiated and released when the access has completed.  However, the two
types of accesses should not be protected by the same mutex because the
driver may have to access the NVM while already holding the mutex over
several consecutive PHY/CSR accesses which would result in livelock.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoe1000e: 82577/82578 requires a different method to configure LPLU
Bruce Allan [Mon, 26 Oct 2009 11:23:25 +0000]
e1000e: 82577/82578 requires a different method to configure LPLU

Unlike previous ICHx-based parts, the PCH-based parts (82577/82578) require
LPLU (Low Power Link Up, or "reverse auto-negotiation") to be configured in
the PHY rather than the MAC.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoe1000e: increase swflag acquisition timeout for ICHx/PCH
Bruce Allan [Mon, 26 Oct 2009 11:23:06 +0000]
e1000e: increase swflag acquisition timeout for ICHx/PCH

In some conditions (e.g. when AMT is enabled on the system), it is possible
to take an extended period of time to for the driver to acquire the sw/fw/hw
hardware semaphore used to protect against concurrent access of a shared
resource (e.g. PHY registers).  This could cause PHY registers to not get
configured properly resulting in link issues.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoe1000e: clear PHY wakeup bit after LCD reset on 82577/82578
Bruce Allan [Mon, 26 Oct 2009 11:22:47 +0000]
e1000e: clear PHY wakeup bit after LCD reset on 82577/82578

Performing a dummy read of the PHY Wakeup Control (WUC) register clears the
wakeup enable bit set by an PHY reset.  If this bit remains set, link
problems may occur.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoigbvf: fix memory leak when ring size changed while interface down
Alexander Duyck [Mon, 26 Oct 2009 11:32:25 +0000]
igbvf: fix memory leak when ring size changed while interface down

This patch resolves a memory leak which occurs while changing the ring size
while the interface is down.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoixgbe: fix memory leak when resizing rings while interface is down
Alexander Duyck [Mon, 26 Oct 2009 11:32:05 +0000]
ixgbe: fix memory leak when resizing rings while interface is down

This patch resolves a memory leak that occurs when you resize the rings via
the ethtool -G option while the interface is down.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoigb: fix memory leak when setting ring size while interface is down
Alexander Duyck [Mon, 26 Oct 2009 11:31:47 +0000]
igb: fix memory leak when setting ring size while interface is down

Changing ring sizes while the interface was down was causing a double
allocation of the receive and transmit rings.  This issue is amplified when
there are multiple rings enabled.  To prevent this we need to add an
additional check which will just update the ring counts when the interface
is not up and skip the allocation steps.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobonding: Modify hash transmit policies to use the packet's source MAC address
Jasper Spaans [Fri, 23 Oct 2009 04:08:46 +0000]
bonding: Modify hash transmit policies to use the packet's source MAC address

Modify bonding hash transmit policies to use the psource MAC address of
the packet instead of the MAC address configured for the bonding device.

The old sitation conflicts with the documentation.

Signed-off-by: Jasper Spaans <spaans@fox-it.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agopktgen: Dont leak kernel memory
Eric Dumazet [Sat, 24 Oct 2009 13:55:20 +0000]
pktgen: Dont leak kernel memory

While playing with pktgen, I realized IP ID was not filled and a
random value was taken, possibly leaking 2 bytes of kernel memory.

We can use an increasing ID, this can help diagnostics anyway.

Also clear packet payload, instead of leaking kernel memory.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoDM9000: Fix revision ID for DM9000B
Ben Dooks [Sat, 24 Oct 2009 13:53:07 +0000]
DM9000: Fix revision ID for DM9000B

The DM9000B revision ID is 0x1A, not 0x1B as set in the curernt
dm9000.h header.

Fix bug reported by Paolo Zebelloni.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: Simtec Linux Team <linux@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agor8169: fix Ethernet Hangup for RTL8110SC rev d
Simon Wunderlich [Sat, 24 Oct 2009 13:47:33 +0000]
r8169: fix Ethernet Hangup for RTL8110SC rev d

The 8110SC rev d chip on our board shows a regression which the 8110SB chip
did not have. When inbound traffic is overflowing the receive descriptor queue,
"holes" in the ring buffer may occur which lead to a hangup until the buffer
is filled again. The packets are than completely processed, but the ring
remains porous and no packets are processed until the next overflow. Setting
the interface down and up can fix the problem temporary from userspace.

For some reason we don't know, this behaviour is not occuring if the RxVlan
bit for hardware VLAN untagging is set. There is another "Work around for
AMD plateform" in the current code which checks the VLAN status
word in receive descriptors, but does never come to effect when hardware
VLAN support is enabled. We assume that this is a bug in the chip.

The following patch fixes the problem. Without the patch we could reproduce
the hang within minutes (given other devices also generating lots of
interrupts), without we couldn't reproduce within a few days of long term
testing.

This version contains minor style adjustments and is sent with mutt which
will hopefully not destroy the formatting again.

Signed-off-by: Bernhard Schmidt <bernhard.schmidt@saxnet.de>
Signed-off-by: Simon Wunderlich <simon.wunderlich@saxnet.de>
Acked-by: Francois Romieu <romieu@zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agortnetlink: speedup rtnl_dump_ifinfo()
Eric Dumazet [Sat, 24 Oct 2009 13:13:17 +0000]
rtnetlink: speedup rtnl_dump_ifinfo()

When handling large number of netdevice, rtnl_dump_ifinfo()
is very slow because it has O(N^2) complexity.

Instead of scanning one single list, we can use the 256 sub lists
of the dev_index hash table.

This considerably speedups "ip link" operations

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agogre: convert hash tables locking to RCU
Eric Dumazet [Fri, 23 Oct 2009 06:14:38 +0000]
gre: convert hash tables locking to RCU

GRE tunnels use one rwlock to protect their hash tables.

This locking scheme can be converted to RCU for free, since netdevice
already must wait for a RCU grace period at dismantle time.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoip6tnl: convert hash tables locking to RCU
Eric Dumazet [Fri, 23 Oct 2009 06:34:34 +0000]
ip6tnl: convert hash tables locking to RCU

ip6_tunnels use one rwlock to protect their hash tables.

This locking scheme can be converted to RCU for free, since netdevice
already must wait for a RCU grace period at dismantle time.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipip: convert hash tables locking to RCU
Eric Dumazet [Fri, 23 Oct 2009 05:42:02 +0000]
ipip: convert hash tables locking to RCU

IPIP tunnels use one rwlock to protect their hash tables.

This locking scheme can be converted to RCU for free, since netdevice
already must wait for a RCU grace period at dismantle time.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoxfrm6_tunnel: RCU conversion
Eric Dumazet [Fri, 23 Oct 2009 18:19:19 +0000]
xfrm6_tunnel: RCU conversion

xfrm6_tunnels use one rwlock to protect their hash tables.

Plain and straightforward conversion to RCU locking to permit better SMP
performance.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv6 sit: RCU conversion phase II
Eric Dumazet [Fri, 23 Oct 2009 17:52:14 +0000]
ipv6 sit: RCU conversion phase II

SIT tunnels use one rwlock to protect their hash tables.

This locking scheme can be converted to RCU for free, since netdevice
already must wait for a RCU grace period at dismantle time.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv6 sit: RCU conversion phase I
Eric Dumazet [Fri, 23 Oct 2009 17:51:26 +0000]
ipv6 sit: RCU conversion phase I

SIT tunnels use one rwlock to protect their prl entries.

This first patch adds RCU locking for prl management,
with standard call_rcu() calls.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Rename 'xfp' file and functions to reflect reality
Ben Hutchings [Fri, 23 Oct 2009 08:33:42 +0000]
sfc: Rename 'xfp' file and functions to reflect reality

The 'XFP' driver is really a driver for the QT2022C2 and QT2025C PHYs,
covering both more and less than XFP.  Rename its functions and
constants to reflect reality and to reduce namespace pollution when
sfc is a built-in driver.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Remove unused code for non-autoneg speed/duplex switching
Ben Hutchings [Fri, 23 Oct 2009 08:33:27 +0000]
sfc: Remove unused code for non-autoneg speed/duplex switching

The only multi-speed PHY driver using this is 10Xpress, and it does
not support non-autoneg operation.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Merge efx_fc_resolve() into efx_mdio_get_pause()
Ben Hutchings [Fri, 23 Oct 2009 08:33:17 +0000]
sfc: Merge efx_fc_resolve() into efx_mdio_get_pause()

efx_fc_resolve() is specific to MDIO and is not used by any other
function.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Move MTD probe after netdev registration and name allocation
Ben Hutchings [Fri, 23 Oct 2009 08:33:09 +0000]
sfc: Move MTD probe after netdev registration and name allocation

The MTD partition is named based on the netdev name, which is set to
'eth%d' before registration.  Also, the MTD partition will currently
be left registered if netdev registration fails.

Fix both these problems by moving the MTD probe after netdev
registration.  Hold the RTNL to serialise this with the netdev
notifier that calls efx_mtd_rename().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Remove unnecessary tests of efx->membase
Ben Hutchings [Fri, 23 Oct 2009 08:33:00 +0000]
sfc: Remove unnecessary tests of efx->membase

These cleanup functions will never be called if the MMIO region could
not be mapped.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Remove incorrect assertion from efx_pci_remove_main()
Ben Hutchings [Fri, 23 Oct 2009 08:32:51 +0000]
sfc: Remove incorrect assertion from efx_pci_remove_main()

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Merge falcon_probe_phy() into falcon_probe_port()
Ben Hutchings [Fri, 23 Oct 2009 08:32:42 +0000]
sfc: Merge falcon_probe_phy() into falcon_probe_port()

MAC and PHY probing are bound up together, as evidenced by the
initialisation of efx_nic::loopback_modes.  Remove the current
arbitrary separation.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Remove pointless abstraction of memory BAR number
Ben Hutchings [Fri, 23 Oct 2009 08:32:33 +0000]
sfc: Remove pointless abstraction of memory BAR number

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Removed kernel-doc for nonexistent member of efx_phy_operations
Ben Hutchings [Fri, 23 Oct 2009 08:32:22 +0000]
sfc: Removed kernel-doc for nonexistent member of efx_phy_operations

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Maintain interrupt moderation values in ticks, not microseconds
Ben Hutchings [Fri, 23 Oct 2009 08:32:13 +0000]
sfc: Maintain interrupt moderation values in ticks, not microseconds

This simplifies the implementation a lot.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Move shared members of struct falcon_nic_data into struct efx_nic
Ben Hutchings [Fri, 23 Oct 2009 08:32:04 +0000]
sfc: Move shared members of struct falcon_nic_data into struct efx_nic

These will also be used with Siena NICs.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Move efx_xmit_done() declaration into correct stanza
Ben Hutchings [Fri, 23 Oct 2009 08:31:54 +0000]
sfc: Move efx_xmit_done() declaration into correct stanza

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Remove declarations of nonexistent functions
Ben Hutchings [Fri, 23 Oct 2009 08:31:46 +0000]
sfc: Remove declarations of nonexistent functions

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Change order of device removal to reverse of probe order
Ben Hutchings [Fri, 23 Oct 2009 08:31:37 +0000]
sfc: Change order of device removal to reverse of probe order

This makes efx_pci_remove_main() more obviously the inverse of
efx_pci_probe_main(), and matches our out-of-tree driver.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Merge struct efx_blinker into struct efx_board
Ben Hutchings [Fri, 23 Oct 2009 08:31:29 +0000]
sfc: Merge struct efx_blinker into struct efx_board

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Move all TX DMA length limiting into tx.c
Ben Hutchings [Fri, 23 Oct 2009 08:31:20 +0000]
sfc: Move all TX DMA length limiting into tx.c

Replace the duplicated logic in efx_enqueue_skb() and
efx_tx_queue_insert() with an inline function, efx_max_tx_len().

Remove the failed attempt at abstracting hardware-specifics and put
all the magic numbers in efx_max_tx_len().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Define DMA address mask explicitly in terms of descriptor field width
Ben Hutchings [Fri, 23 Oct 2009 08:31:07 +0000]
sfc: Define DMA address mask explicitly in terms of descriptor field width

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Eliminate indirect lookups of queue size constants
Ben Hutchings [Fri, 23 Oct 2009 08:30:58 +0000]
sfc: Eliminate indirect lookups of queue size constants

Move size and mask definitions into efx.h; calculate page orders in falcon.c.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Rename register I/O header and functions used by both Falcon and Siena
Ben Hutchings [Fri, 23 Oct 2009 08:30:46 +0000]
sfc: Rename register I/O header and functions used by both Falcon and Siena

While we're at it, use type suffixes of 'd', 'q' and 'o', consistent
with register type names.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Update hardware definitions for Siena
Ben Hutchings [Fri, 23 Oct 2009 08:30:36 +0000]
sfc: Update hardware definitions for Siena

Siena is still based on the Falcon hardware architecture and will
share many of these definitions, so replace falcon_hwdefs.h with
regs.h.

The new definitions have been generated according to a naming
convention which incorporates the type and revision information.
Update the code accordingly.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Move RX data FIFO thresholds out of struct efx_nic_type
Ben Hutchings [Fri, 23 Oct 2009 08:30:17 +0000]
sfc: Move RX data FIFO thresholds out of struct efx_nic_type

Since there are now separate blocks of code to set the thresholds for
each NIC type, it is no longer useful to include them in the NIC type
description.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Remove versioned bitfield macros
Ben Hutchings [Fri, 23 Oct 2009 08:30:06 +0000]
sfc: Remove versioned bitfield macros

These macros are not extensible to more than two NIC types without
repetition of register definitions, and they are only used to deal
with a few fields in RX_CFG_REG and global events which moved between
Falcon rev A1 and B0.

Therefore:
- Move RX_CFG_REG initialisation into its own function which tests the
  NIC revision just once
- Explicitly test the NIC revision when checking the RX_RECOVERY flag in
  global events
- Merge definitions of RX_XOFF_MAC_EN flag, which did not move
- Remove the macro definitions

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Remove boards.h, moving last remaining declaration to falcon.h
Ben Hutchings [Fri, 23 Oct 2009 08:29:51 +0000]
sfc: Remove boards.h, moving last remaining declaration to falcon.h

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Merge sfe4001.c into falcon_boards.c
Ben Hutchings [Fri, 23 Oct 2009 08:29:33 +0000]
sfc: Merge sfe4001.c into falcon_boards.c

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Rename Falcon-specific board code and types
Ben Hutchings [Fri, 23 Oct 2009 08:29:16 +0000]
sfc: Rename Falcon-specific board code and types

Siena will require entirely different board code.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Remove redundant hardware initialisation
Ben Hutchings [Fri, 23 Oct 2009 08:28:53 +0000]
sfc: Remove redundant hardware initialisation

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: Remove redundant header gmii.h
Ben Hutchings [Fri, 23 Oct 2009 08:28:45 +0000]
sfc: Remove redundant header gmii.h

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agopkt_sched: skbedit add support for setting mark
jamal [Thu, 15 Oct 2009 03:09:18 +0000]
pkt_sched: skbedit add support for setting mark

This adds support for setting the skb mark.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoifb: should not use __dev_get_by_index() without locks
Eric Dumazet [Tue, 20 Oct 2009 02:35:50 +0000]
ifb: should not use __dev_get_by_index() without locks

At this point (ri_tasklet()), RTNL or dev_base_lock are not held,
we must use dev_get_by_index() instead of __dev_get_by_index()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: au1000_eth: add missing capability.h
Manuel Lauss [Sat, 17 Oct 2009 02:00:07 +0000]
net: au1000_eth: add missing capability.h

fixes the following build failure:
CC drivers/net/au1000_eth.o
/drivers/net/au1000_eth.c: In function 'au1000_set_settings':
/drivers/net/au1000_eth.c:623: error: implicit declaration of function 'capable'
/drivers/net/au1000_eth.c:623: error: 'CAP_NET_ADMIN' undeclared (first use in this function)
/drivers/net/au1000_eth.c:623: error: (Each undeclared identifier is reported only once
/drivers/net/au1000_eth.c:623: error: for each function it appears in.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agomyri10ge: improve port type reporting in ethtool
Brice Goglin [Fri, 23 Oct 2009 04:43:43 +0000]
myri10ge: improve port type reporting in ethtool

Improve the reporting of myri10ge port type in ethtool,
and update for new boards.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: use WARN() for the WARN_ON in commit b6b39e8f3fbbb
Arjan van de Ven [Fri, 23 Oct 2009 04:37:56 +0000]
net: use WARN() for the WARN_ON in commit b6b39e8f3fbbb

Commit b6b39e8f3fbbb (tcp: Try to catch MSG_PEEK bug) added a printk()
to the WARN_ON() that's in tcp.c. This patch changes this combination
to WARN(); the advantage of WARN() is that the printk message shows up
inside the message, so that kerneloops.org will collect the message.

In addition, this gets rid of an extra if() statement.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoe1000e: reset the PHY on 82577/82578 when going to Sx
Bruce Allan [Fri, 23 Oct 2009 04:22:18 +0000]
e1000e: reset the PHY on 82577/82578 when going to Sx

The PHY on 82577/82578 parts needs a soft reset when transitioning to Sx
state in order for the PHY write which disables gigabit speed to take
effect.  Gigabit speed must be disabled in order for the PHY writes to
registers on page 800 (the wakeup control registers) to work as expected
otherwise the system might not wake via WoL.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosfc: 10Xpress: Report support for pause frames
Ben Hutchings [Fri, 23 Oct 2009 01:31:39 +0000]
sfc: 10Xpress: Report support for pause frames

Commits 27fbc7d 'mdio: Expose pause frame advertising flags to ethtool'
and c634263 'sfc: 10Xpress: Initialise pause advertising flags'
added to our reported advertising flags.

efx_mdio_set_settings() requires that all advertising flags are
also present in the supported flags, so make sure that is true.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoisdn: fix possible circular locking dependency
Xiaotian Feng [Wed, 21 Oct 2009 23:07:04 +0000]
isdn: fix possible circular locking dependency

There's a circular locking dependency:

---> isdn_net_get_locked_lp
    --->lock &nd->queue_lock
    --->lock &nd->queue->xmit_lock
    .....................
    ---->unlock &nd->queue_lock

---> isdn_net_writebuf_skb (called with &nd->queue->xmit_lock locked)
    ---->isdn_net_inc_frame_cnt
         ---->isdn_net_device_busy
              ----> lock &nd->queue_lock

This will trigger lockdep warnings:

 =======================================================
 [ INFO: possible circular locking dependency detected ]
 2.6.32-rc4-testing #7
 -------------------------------------------------------
 ipppd/28379 is trying to acquire lock:
 (&netdev->queue_lock){......}, at: [<e62ad0fd>] isdn_net_device_busy+0x2c/0x74 [isdn]

 but task is already holding lock:
 (&netdev->local->xmit_lock){+.....}, at: [<e62aefc2>] isdn_net_write_super+0x3f/0x6e [isdn]

 which lock already depends on the new lock.
 .......

 We don't need to lock nd->queue->xmit_lock to protect single
isdn_net_lp_busy(). This can fix above lockdep warnings.

Reported-and-tested-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Xiaotian Feng <xtfeng@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonetxen: avoid undue board config check
Dhananjay Phadke [Wed, 21 Oct 2009 19:39:03 +0000]
netxen: avoid undue board config check

Old code assumed board config version in the flash to be 1.
When this will get changed by tools, driver just refuses to
attach. This is unnecessary since driver does not have to
parse board config structure directly (maintained by firmware).

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonetxen: fix tx timeout handling on firmware hang
Amit Kumar Salecha [Wed, 21 Oct 2009 19:39:02 +0000]
netxen: fix tx timeout handling on firmware hang

Clear NX_RESETING bit in netxen_tx_timeout_task() so that
the firmware watchdog task can catch need_reset request
from tx timeout.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonetxen: fix i2c init
Dhananjay Phadke [Wed, 21 Oct 2009 19:39:01 +0000]
netxen: fix i2c init

Avoid resetting subsys ID in i2c block. Also remove duplicate
check for address tranlsation error.

Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agortnetlink: rtnl_setlink() and rtnl_getlink() changes
Eric Dumazet [Wed, 21 Oct 2009 10:59:31 +0000]
rtnetlink: rtnl_setlink() and rtnl_getlink() changes

rtnl_getlink() & rtnl_setlink() run with RTNL held, we can use
__dev_get_by_index() and __dev_get_by_name() variants and avoid
dev_hold()/dev_put()

Adds to rtnl_getlink() the capability to find a device by its name,
not only by its index.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoqlge: Add ethtool register dump function.
Ron Mercer [Wed, 21 Oct 2009 11:07:41 +0000]
qlge: Add ethtool register dump function.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoqlge: Add ethtool wake on LAN function.
Ron Mercer [Wed, 21 Oct 2009 11:07:40 +0000]
qlge: Add ethtool wake on LAN function.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoqlge: Add ethtool blink function.
Ron Mercer [Wed, 21 Oct 2009 11:07:39 +0000]
qlge: Add ethtool blink function.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoqlge: Add ethtool get/set pause parameter.
Ron Mercer [Wed, 21 Oct 2009 11:07:38 +0000]
qlge: Add ethtool get/set pause parameter.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoniu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied...
Joyce Yu [Thu, 22 Oct 2009 00:21:10 +0000]
niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied to the head buffer in the Vlan packets case

Signed-off-by: Joyce Yu <joyce.yu@sun.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoKS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST
Ben Dooks [Mon, 19 Oct 2009 23:49:05 +0000]
KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST

In ks8851_set_rx_mode() the case handling IFF_MULTICAST was also setting
the RXCR1_AE bit by accident. This meant that all unicast frames where
being accepted by the device. Remove RXCR1_AE from this case.

Note, RXCR1_AE was also masking a problem with setting the MAC address
properly, so needs to be applied after fixing the MAC write order.

Fixes a bug reported by Doong, Ping of Micrel. This version of the
patch avoids setting RXCR1_ME for all cases.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoKS8851: Fix MAC address write order
Ben Dooks [Mon, 19 Oct 2009 23:49:04 +0000]
KS8851: Fix MAC address write order

The MAC address register was being written in the wrong order, so add
a new address macro to convert mac-address byte to register address and
a ks8851_wrreg8() function to write each byte without having to worry
about any difficult byte swapping.

Fixes a bug reported by Doong, Ping of Micrel.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoKS8851: Add soft reset at probe time
Ben Dooks [Mon, 19 Oct 2009 23:49:03 +0000]
KS8851: Add soft reset at probe time

Issue a full soft reset at probe time.

This was reported by Doong Ping of Micrel, but no explanation of why this
is necessary or what bug it is fixing. Add it as it does not seem to hurt
the current driver and ensures that the device is in a known state when we
start setting it up.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: Use sk_tx_queue_mapping for connected sockets
Krishna Kumar [Mon, 19 Oct 2009 23:50:07 +0000]
net: Use sk_tx_queue_mapping for connected sockets

For connected sockets, the first run of dev_pick_tx saves the
calculated txq in sk_tx_queue_mapping. This is not saved if
either the device has a queue select or the socket is not
connected. Next iterations of dev_pick_tx uses the cached value
of sk_tx_queue_mapping.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: Fix for dst_negative_advice
Krishna Kumar [Mon, 19 Oct 2009 23:46:45 +0000]
net: Fix for dst_negative_advice

dst_negative_advice() should check for changed dst and reset
sk_tx_queue_mapping accordingly. Pass sock to the callers of
dst_negative_advice.

(sk_reset_txq is defined just for use by dst_negative_advice. The
only way I could find to get around this is to move dst_negative_()
from dst.h to dst.c, include sock.h in dst.c, etc)

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: IPv6 changes
Krishna Kumar [Mon, 19 Oct 2009 23:46:32 +0000]
net: IPv6 changes

IPv6: Reset sk_tx_queue_mapping when dst_cache is reset. Use existing
macro to do the work.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: Introduce sk_tx_queue_mapping
Krishna Kumar [Mon, 19 Oct 2009 23:46:20 +0000]
net: Introduce sk_tx_queue_mapping

Introduce sk_tx_queue_mapping; and functions that set, test and
get this value. Reset sk_tx_queue_mapping to -1 whenever the dst
cache is set/reset, and in socket alloc. Setting txq to -1 and
using valid txq=<0 to n-1> allows the tx path to use the value
of sk_tx_queue_mapping directly instead of subtracting 1 on every
tx.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: fix section mismatch in fec.c
Steven King [Wed, 21 Oct 2009 01:51:37 +0000]
net: fix section mismatch in fec.c

fec_enet_init is called by both fec_probe and fec_resume, so it
shouldn't be marked as __init.

Signed-off-by: Steven King <sfking@fdwdc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: Fix struct inet_timewait_sock bitfield annotation
Eric Dumazet [Sun, 18 Oct 2009 22:48:51 +0000]
net: Fix struct inet_timewait_sock bitfield annotation

commit 9e337b0f (net: annotate inet_timewait_sock bitfields)
added 4/8 bytes in struct inet_timewait_sock.

Fix this by declaring tw_ipv6_offset in the 'flags' bitfield
The 14 bits hole is named tw_pad to make it cleary apparent.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: Avoid compiler warning for mmsghdr when CONFIG_COMPAT is not selected
Arnaldo Carvalho de Melo [Tue, 20 Oct 2009 08:09:17 +0000]
net: Avoid compiler warning for mmsghdr when CONFIG_COMPAT is not selected

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agofilter: Add SKF_AD_QUEUE instruction
Eric Dumazet [Tue, 20 Oct 2009 08:06:22 +0000]
filter: Add SKF_AD_QUEUE instruction

It can help being able to filter packets on their queue_mapping.

If filter performance is not good, we could add a "numqueue" field
in struct packet_type, so that netif_nit_deliver() and other functions
can directly ignore packets with not expected queue number.

Lets experiment this simple filter extension first.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoaf_packet: mc_drop/flush_mclist changes
Eric Dumazet [Fri, 16 Oct 2009 06:38:46 +0000]
af_packet: mc_drop/flush_mclist changes

We hold RTNL, we can use __dev_get_by_index() instead of dev_get_by_index()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoaf_packet: Avoid cache line dirtying
Eric Dumazet [Fri, 16 Oct 2009 04:02:20 +0000]
af_packet: Avoid cache line dirtying

While doing multiple captures, I found af_packet was dirtying cache line
containing its prot_hook.

This slow down machines where several cpus are necessary to handle capture
traffic, as each prot_hook is traversed for each packet coming in or out
the host.

This patches moves "struct packet_type prot_hook" to the end of
packet_sock, and uses a ____cacheline_aligned_in_smp to make sure
this remains shared by all cpus.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotcp: Try to catch MSG_PEEK bug
Herbert Xu [Mon, 19 Oct 2009 19:41:06 +0000]
tcp: Try to catch MSG_PEEK bug

This patch tries to print out more information when we hit the
MSG_PEEK bug in tcp_recvmsg.  It's been around since at least
2005 and it's about time that we finally fix it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agocan: provide library functions for skb allocation
Wolfgang Grandegger [Tue, 20 Oct 2009 07:08:01 +0000]
can: provide library functions for skb allocation

This patch makes the private functions alloc_can_skb() and
alloc_can_err_skb() of the at91_can driver public and adapts all
drivers to use these. While making the patch I realized, that
the skb's are *not* setup consistently. It's now done as shown
below:

  skb->protocol = htons(ETH_P_CAN);
  skb->pkt_type = PACKET_BROADCAST;
  skb->ip_summed = CHECKSUM_UNNECESSARY;
  *cf = (struct can_frame *)skb_put(skb, sizeof(struct can_frame));
  memset(*cf, 0, sizeof(struct can_frame));

The frame is zeroed out to avoid uninitialized data to be passed to
user space. Some drivers or library code did not set "pkt_type" or
"ip_summed". Also,  "__constant_htons()" should not be used for
runtime invocations, as pointed out by David Miller.

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoIP: Cleanups
John Dykstra [Tue, 20 Oct 2009 04:53:53 +0000]
IP: Cleanups

Use symbols instead of magic constants while checking PMTU discovery
setsockopt.

Remove redundant test in ip_rt_frag_needed() (done by caller).

Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoi2400m-sdio: select IWMC3200TOP in Kconfig
Tomas Winkler [Sat, 17 Oct 2009 09:09:36 +0000]
i2400m-sdio: select IWMC3200TOP in Kconfig

i2400m-sdio requires iwmc3200top for its operation

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoiwmc3200wifi: select IWMC3200TOP in Kconfig
Tomas Winkler [Sat, 17 Oct 2009 09:09:35 +0000]
iwmc3200wifi: select IWMC3200TOP in Kconfig

iwmc3200wifi requires iwmc3200top  for its operation

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoiwmc3200top: Add Intel Wireless MultiCom 3200 top driver.
Tomas Winkler [Sat, 17 Oct 2009 09:09:34 +0000]
iwmc3200top: Add Intel Wireless MultiCom 3200 top driver.

This patch adds Intel Wireless MultiCom 3200 top driver.
IWMC3200 is 4Wireless Com CHIP (GPS/BT/WiFi/WiMAX).
Top driver is responsible for device initialization and firmware download.
Firmware handled by top is responsible for top itself and
as well as bluetooth and GPS coms. (Wifi and WiMax provide their own firmware)
In addition top driver is used to retrieve firmware logs
and supports other debugging features

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agopkt_sched: ingress socket filter by mark
jamal [Mon, 19 Oct 2009 02:17:56 +0000]
pkt_sched: ingress socket filter by mark

Allow bpf to set a filter to drop packets that dont
match a specific mark

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoqlge: Size RX buffers based on MTU.
Ron Mercer [Mon, 19 Oct 2009 03:32:19 +0000]
qlge: Size RX buffers based on MTU.

Change RX large buffer size based on MTU. If pages are larger
than the MTU the page is divided up into multiple chunks and passed to
the hardware.  When pages are smaller than MTU each RX buffer can
contain be comprised of up to 2 pages.

Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: Fix IP_MULTICAST_IF
Eric Dumazet [Mon, 19 Oct 2009 06:41:58 +0000]
net: Fix IP_MULTICAST_IF

ipv4/ipv6 setsockopt(IP_MULTICAST_IF) have dubious __dev_get_by_index() calls.

This function should be called only with RTNL or dev_base_lock held, or reader
could see a corrupt hash chain and eventually enter an endless loop.

Fix is to call dev_get_by_index()/dev_put().

If this happens to be performance critical, we could define a new dev_exist_by_index()
function to avoid touching dev refcount.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobluetooth: static lock key fix
Dave Young [Sun, 18 Oct 2009 20:28:30 +0000]
bluetooth: static lock key fix

When shutdown ppp connection, lockdep waring about non-static key
will happen, it is caused by the lock is not initialized properly
at that time.

Fix with tuning the lock/skb_queue_head init order

[   94.339261] INFO: trying to register non-static key.
[   94.342509] the code is fine but needs lockdep annotation.
[   94.342509] turning off the locking correctness validator.
[   94.342509] Pid: 0, comm: swapper Not tainted 2.6.31-mm1 #2
[   94.342509] Call Trace:
[   94.342509]  [<c0248fbe>] register_lock_class+0x58/0x241
[   94.342509]  [<c024b5df>] ? __lock_acquire+0xb57/0xb73
[   94.342509]  [<c024ab34>] __lock_acquire+0xac/0xb73
[   94.342509]  [<c024b7fa>] ? lock_release_non_nested+0x17b/0x1de
[   94.342509]  [<c024b662>] lock_acquire+0x67/0x84
[   94.342509]  [<c04cd1eb>] ? skb_dequeue+0x15/0x41
[   94.342509]  [<c054a857>] _spin_lock_irqsave+0x2f/0x3f
[   94.342509]  [<c04cd1eb>] ? skb_dequeue+0x15/0x41
[   94.342509]  [<c04cd1eb>] skb_dequeue+0x15/0x41
[   94.342509]  [<c054a648>] ? _read_unlock+0x1d/0x20
[   94.342509]  [<c04cd641>] skb_queue_purge+0x14/0x1b
[   94.342509]  [<fab94fdc>] l2cap_recv_frame+0xea1/0x115a [l2cap]
[   94.342509]  [<c024b5df>] ? __lock_acquire+0xb57/0xb73
[   94.342509]  [<c0249c04>] ? mark_lock+0x1e/0x1c7
[   94.342509]  [<f8364963>] ? hci_rx_task+0xd2/0x1bc [bluetooth]
[   94.342509]  [<fab95346>] l2cap_recv_acldata+0xb1/0x1c6 [l2cap]
[   94.342509]  [<f8364997>] hci_rx_task+0x106/0x1bc [bluetooth]
[   94.342509]  [<fab95295>] ? l2cap_recv_acldata+0x0/0x1c6 [l2cap]
[   94.342509]  [<c02302c4>] tasklet_action+0x69/0xc1
[   94.342509]  [<c022fbef>] __do_softirq+0x94/0x11e
[   94.342509]  [<c022fcaf>] do_softirq+0x36/0x5a
[   94.342509]  [<c022fe14>] irq_exit+0x35/0x68
[   94.342509]  [<c0204ced>] do_IRQ+0x72/0x89
[   94.342509]  [<c02038ee>] common_interrupt+0x2e/0x34
[   94.342509]  [<c024007b>] ? pm_qos_add_requirement+0x63/0x9d
[   94.342509]  [<c038e8a5>] ? acpi_idle_enter_bm+0x209/0x238
[   94.342509]  [<c049d238>] cpuidle_idle_call+0x5c/0x94
[   94.342509]  [<c02023f8>] cpu_idle+0x4e/0x6f
[   94.342509]  [<c0534153>] rest_init+0x53/0x55
[   94.342509]  [<c0781894>] start_kernel+0x2f0/0x2f5
[   94.342509]  [<c0781091>] i386_start_kernel+0x91/0x96

Reported-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobluetooth: scheduling while atomic bug fix
Dave Young [Sun, 18 Oct 2009 20:24:41 +0000]
bluetooth: scheduling while atomic bug fix

Due to driver core changes dev_set_drvdata will call kzalloc which should be
in might_sleep context, but hci_conn_add will be called in atomic context

Like dev_set_name move dev_set_drvdata to work queue function.

oops as following:

Oct  2 17:41:59 darkstar kernel: [  438.001341] BUG: sleeping function called from invalid context at mm/slqb.c:1546
Oct  2 17:41:59 darkstar kernel: [  438.001345] in_atomic(): 1, irqs_disabled(): 0, pid: 2133, name: sdptool
Oct  2 17:41:59 darkstar kernel: [  438.001348] 2 locks held by sdptool/2133:
Oct  2 17:41:59 darkstar kernel: [  438.001350]  #0:  (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.+.}, at: [<faa1d2f5>] lock_sock+0xa/0xc [l2cap]
Oct  2 17:41:59 darkstar kernel: [  438.001360]  #1:  (&hdev->lock){+.-.+.}, at: [<faa20e16>] l2cap_sock_connect+0x103/0x26b [l2cap]
Oct  2 17:41:59 darkstar kernel: [  438.001371] Pid: 2133, comm: sdptool Not tainted 2.6.31-mm1 #2
Oct  2 17:41:59 darkstar kernel: [  438.001373] Call Trace:
Oct  2 17:41:59 darkstar kernel: [  438.001381]  [<c022433f>] __might_sleep+0xde/0xe5
Oct  2 17:41:59 darkstar kernel: [  438.001386]  [<c0298843>] __kmalloc+0x4a/0x15a
Oct  2 17:41:59 darkstar kernel: [  438.001392]  [<c03f0065>] ? kzalloc+0xb/0xd
Oct  2 17:41:59 darkstar kernel: [  438.001396]  [<c03f0065>] kzalloc+0xb/0xd
Oct  2 17:41:59 darkstar kernel: [  438.001400]  [<c03f04ff>] device_private_init+0x15/0x3d
Oct  2 17:41:59 darkstar kernel: [  438.001405]  [<c03f24c5>] dev_set_drvdata+0x18/0x26
Oct  2 17:41:59 darkstar kernel: [  438.001414]  [<fa51fff7>] hci_conn_init_sysfs+0x40/0xd9 [bluetooth]
Oct  2 17:41:59 darkstar kernel: [  438.001422]  [<fa51cdc0>] ? hci_conn_add+0x128/0x186 [bluetooth]
Oct  2 17:41:59 darkstar kernel: [  438.001429]  [<fa51ce0f>] hci_conn_add+0x177/0x186 [bluetooth]
Oct  2 17:41:59 darkstar kernel: [  438.001437]  [<fa51cf8a>] hci_connect+0x3c/0xfb [bluetooth]
Oct  2 17:41:59 darkstar kernel: [  438.001442]  [<faa20e87>] l2cap_sock_connect+0x174/0x26b [l2cap]
Oct  2 17:41:59 darkstar kernel: [  438.001448]  [<c04c8df5>] sys_connect+0x60/0x7a
Oct  2 17:41:59 darkstar kernel: [  438.001453]  [<c024b703>] ? lock_release_non_nested+0x84/0x1de
Oct  2 17:41:59 darkstar kernel: [  438.001458]  [<c028804b>] ? might_fault+0x47/0x81
Oct  2 17:41:59 darkstar kernel: [  438.001462]  [<c028804b>] ? might_fault+0x47/0x81
Oct  2 17:41:59 darkstar kernel: [  438.001468]  [<c033361f>] ? __copy_from_user_ll+0x11/0xce
Oct  2 17:41:59 darkstar kernel: [  438.001472]  [<c04c9419>] sys_socketcall+0x82/0x17b
Oct  2 17:41:59 darkstar kernel: [  438.001477]  [<c020329d>] syscall_call+0x7/0xb

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotcp: fix TCP_DEFER_ACCEPT retrans calculation
Julian Anastasov [Mon, 19 Oct 2009 10:10:40 +0000]
tcp: fix TCP_DEFER_ACCEPT retrans calculation

Fix TCP_DEFER_ACCEPT conversion between seconds and
retransmission to match the TCP SYN-ACK retransmission periods
because the time is converted to such retransmissions. The old
algorithm selects one more retransmission in some cases. Allow
up to 255 retransmissions.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT
Julian Anastasov [Mon, 19 Oct 2009 10:03:58 +0000]
tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT

Change SYN-ACK retransmitting code for the TCP_DEFER_ACCEPT
users to not retransmit SYN-ACKs during the deferring period if
ACK from client was received. The goal is to reduce traffic
during the deferring period. When the period is finished
we continue with sending SYN-ACKs (at least one) but this time
any traffic from client will change the request to established
socket allowing application to terminate it properly.
Also, do not drop acked request if sending of SYN-ACK fails.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotcp: accept socket after TCP_DEFER_ACCEPT period
Julian Anastasov [Mon, 19 Oct 2009 10:01:56 +0000]
tcp: accept socket after TCP_DEFER_ACCEPT period

Willy Tarreau and many other folks in recent years
were concerned what happens when the TCP_DEFER_ACCEPT period
expires for clients which sent ACK packet. They prefer clients
that actively resend ACK on our SYN-ACK retransmissions to be
converted from open requests to sockets and queued to the
listener for accepting after the deferring period is finished.
Then application server can decide to wait longer for data
or to properly terminate the connection with FIN if read()
returns EAGAIN which is an indication for accepting after
the deferring period. This change still can have side effects
for applications that expect always to see data on the accepted
socket. Others can be prepared to work in both modes (with or
without TCP_DEFER_ACCEPT period) and their data processing can
ignore the read=EAGAIN notification and to allocate resources for
clients which proved to have no data to send during the deferring
period. OTOH, servers that use TCP_DEFER_ACCEPT=1 as flag (not
as a timeout) to wait for data will notice clients that didn't
send data for 3 seconds but that still resend ACKs.
Thanks to Willy Tarreau for the initial idea and to
Eric Dumazet for the review and testing the change.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoRevert "tcp: fix tcp_defer_accept to consider the timeout"
David S. Miller [Tue, 20 Oct 2009 02:12:36 +0000]
Revert "tcp: fix tcp_defer_accept to consider the timeout"

This reverts commit 6d01a026b7d3009a418326bdcf313503a314f1ea.

Julian Anastasov, Willy Tarreau and Eric Dumazet have come up
with a more correct way to deal with this.

Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoAF_UNIX: Fix deadlock on connecting to shutdown socket
Tomoki Sekiyama [Mon, 19 Oct 2009 06:17:37 +0000]
AF_UNIX: Fix deadlock on connecting to shutdown socket

I found a deadlock bug in UNIX domain socket, which makes able to DoS
attack against the local machine by non-root users.

How to reproduce:
1. Make a listening AF_UNIX/SOCK_STREAM socket with an abstruct
    namespace(*), and shutdown(2) it.
 2. Repeat connect(2)ing to the listening socket from the other sockets
    until the connection backlog is full-filled.
 3. connect(2) takes the CPU forever. If every core is taken, the
    system hangs.

PoC code: (Run as many times as cores on SMP machines.)

int main(void)
{
int ret;
int csd;
int lsd;
struct sockaddr_un sun;

/* make an abstruct name address (*) */
memset(&sun, 0, sizeof(sun));
sun.sun_family = PF_UNIX;
sprintf(&sun.sun_path[1], "%d", getpid());

/* create the listening socket and shutdown */
lsd = socket(AF_UNIX, SOCK_STREAM, 0);
bind(lsd, (struct sockaddr *)&sun, sizeof(sun));
listen(lsd, 1);
shutdown(lsd, SHUT_RDWR);

/* connect loop */
alarm(15); /* forcely exit the loop after 15 sec */
for (;;) {
csd = socket(AF_UNIX, SOCK_STREAM, 0);
ret = connect(csd, (struct sockaddr *)&sun, sizeof(sun));
if (-1 == ret) {
perror("connect()");
break;
}
puts("Connection OK");
}
return 0;
}

(*) Make sun_path[0] = 0 to use the abstruct namespace.
    If a file-based socket is used, the system doesn't deadlock because
    of context switches in the file system layer.

Why this happens:
 Error checks between unix_socket_connect() and unix_wait_for_peer() are
 inconsistent. The former calls the latter to wait until the backlog is
 processed. Despite the latter returns without doing anything when the
 socket is shutdown, the former doesn't check the shutdown state and
 just retries calling the latter forever.

Patch:
 The patch below adds shutdown check into unix_socket_connect(), so
 connect(2) to the shutdown socket will return -ECONREFUSED.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
Signed-off-by: Masanori Yoshida <masanori.yoshida.tv@hitachi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoxfrm: remove skb_icv_walk
Steffen Klassert [Wed, 7 Oct 2009 22:50:40 +0000]
xfrm: remove skb_icv_walk

The last users of skb_icv_walk are converted to ahash now,
so skb_icv_walk is unused and can be removed.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoah: Remove obsolete code
Steffen Klassert [Wed, 7 Oct 2009 22:49:57 +0000]
ah: Remove obsolete code

ah4 and ah6 are converted to ahash now, so we can remove the
code for the obsolete hash algorithm.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>