9 years agonet: sk_sleep() helper
Eric Dumazet [Tue, 20 Apr 2010 13:03:51 +0000]
net: sk_sleep() helper

Define a new function to return the waitqueue of a "struct sock".

static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
return sk->sk_sleep;
}

Change all read occurrences of sk_sleep by a call to this function.

Needed for a future RCU conversion. sk_sleep wont be a field directly
available.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: emphasize rtnl lock required in call_netdevice_notifiers
Jiri Pirko [Tue, 20 Apr 2010 08:45:37 +0000]
net: emphasize rtnl lock required in call_netdevice_notifiers

Since netdev_chain is guarded by rtnl_lock, ASSERT_RTNL should be
present here to make sure that all callers of call_netdevice_notifiers
does the locking properly.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agorps: consistent rxhash
Eric Dumazet [Mon, 19 Apr 2010 21:56:38 +0000]
rps: consistent rxhash

In case we compute a software skb->rxhash, we can generate a consistent
hash : Its value will be the same in both flow directions.

This helps some workloads, like conntracking, since the same state needs
to be accessed in both directions.

tbench + RFS + this patch gives better results than tbench with default
kernel configuration (no RPS, no RFS)

Also fixed some sparse warnings.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agorps: cleanups
Eric Dumazet [Mon, 19 Apr 2010 21:17:14 +0000]
rps: cleanups

struct softnet_data holds many queues, so consistent use "sd" name
instead of "queue" is better.

Adds a rps_ipi_queued() helper to cleanup enqueue_to_backlog()

Adds a _and_irq_disable suffix to net_rps_action() name, as David
suggested.

incr_input_queue_head() becomes input_queue_head_incr()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agorps: static functions
Eric Dumazet [Mon, 19 Apr 2010 21:40:57 +0000]
rps: static functions

store_rps_map() & store_rps_dev_flow_table_cnt() are static.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agorps: shortcut net_rps_action()
Eric Dumazet [Mon, 19 Apr 2010 05:07:33 +0000]
rps: shortcut net_rps_action()

net_rps_action() is a bit expensive on NR_CPUS=64..4096 kernels, even if
RPS is not active.

Tom Herbert used two bitmasks to hold information needed to send IPI,
but a single LIFO list seems more appropriate.

Move all RPS logic into net_rps_action() to cleanup net_rx_action() code
(remove two ifdefs)

Move rps_remote_softirq_cpus into softnet_data to share its first cache
line, filling an existing hole.

In a future patch, we could call net_rps_action() from process_backlog()
to make sure we send IPI before handling this cpu backlog.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: Date and version
Vladislav Zolotarov [Mon, 19 Apr 2010 01:15:17 +0000]
bnx2x: Date and version

Set version to 1.52.53-1.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: Don't report link down if has been already down
Vladislav Zolotarov [Mon, 19 Apr 2010 01:15:08 +0000]
bnx2x: Don't report link down if has been already down

Author: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: Rework power state handling code
Vladislav Zolotarov [Mon, 19 Apr 2010 01:14:49 +0000]
bnx2x: Rework power state handling code

Move "don't shut down the power" logic into bnx2x_set_power_state() to make the code cleaner.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: use mask in test_registers() to avoid parity error
Vladislav Zolotarov [Mon, 19 Apr 2010 01:14:37 +0000]
bnx2x: use mask in test_registers() to avoid parity error

Properly mask the value to be written to the register (according to the register size) during the self-test.
Otherwise immediate parity error would be generated.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: Fixed MSI-X enabling flow
Vladislav Zolotarov [Mon, 19 Apr 2010 01:14:18 +0000]
bnx2x: Fixed MSI-X enabling flow

Try to enable less MSI-X vectors if initial request has failed.

Author: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: Added new statistics
Vladislav Zolotarov [Mon, 19 Apr 2010 01:14:07 +0000]
bnx2x: Added new statistics

Added total_mcast/bcast_pkts_transmitted statistics.

Author: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: White spaces
Vladislav Zolotarov [Mon, 19 Apr 2010 01:13:57 +0000]
bnx2x: White spaces

White spaces, code readability and prints.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: Protect code with NOMCP
Vladislav Zolotarov [Mon, 19 Apr 2010 01:13:49 +0000]
bnx2x: Protect code with NOMCP

Don't run code that can't be run if MCP is not present.
This will prevent NULL pointer dereferencing.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: Increase DMAE max write size for 57711
Vladislav Zolotarov [Mon, 19 Apr 2010 01:13:33 +0000]
bnx2x: Increase DMAE max write size for 57711

Increase DMAE max write size for 57711 to the maximum allowed value.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: Use VPD-R V0 entry to display firmware revision
Vladislav Zolotarov [Mon, 19 Apr 2010 01:13:23 +0000]
bnx2x: Use VPD-R V0 entry to display firmware revision

Author: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobnx2x: Parity errors handling for 57710 and 57711
Vladislav Zolotarov [Mon, 19 Apr 2010 01:13:12 +0000]
bnx2x: Parity errors handling for 57710 and 57711

This patch introduces the parity errors handling code for 57710 and 57711 chips.

HW is configured to stop all DMA transactions to the host and sending packets to the network
once parity error is detected, which is meant to prevent silent data corruption.
At the same time HW generates the attention interrupt to every function of the device where parity
has been detected so that driver can start the recovery flow.

The recovery is actually resetting the chip and restarting the driver on all active functions
of the chip where the parity error has been reported.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: Introduce skb_orphan_try()
Eric Dumazet [Fri, 16 Apr 2010 12:18:22 +0000]
net: Introduce skb_orphan_try()

Transmitted skb might be attached to a socket and a destructor, for
memory accounting purposes.

Traditionally, this destructor is called at tx completion time, when skb
is freed.

When tx completion is performed by another cpu than the sender, this
forces some cache lines to change ownership. XPS was an attempt to give
tx completion to initial cpu.

David idea is to call destructor right before giving skb to device (call
to ndo_start_xmit()). Because device queues are usually small, orphaning
skb before tx completion is not a big deal. Some drivers already do
this, we could do it in upper level.

There is one known exception to this early orphaning, called tx
timestamping. It needs to keep a reference to socket until device can
give a hardware or software timestamp.

This patch adds a skb_orphan_try() helper, to centralize all exceptions
to early orphaning in one spot, and use it in dev_hard_start_xmit().

"tbench 16" results on a Nehalem machine (2 X5570  @ 2.93GHz)
before: Throughput 4428.9 MB/sec 16 procs
after: Throughput 4448.14 MB/sec 16 procs

UDP should get even better results, its destructor being more complex,
since SOCK_USE_WRITE_QUEUE is not set (four atomic ops instead of one)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: remove time limit in process_backlog()
Eric Dumazet [Sat, 17 Apr 2010 04:17:02 +0000]
net: remove time limit in process_backlog()

- There is no point to enforce a time limit in process_backlog(), since
other napi instances dont follow same rule. We can exit after only one
packet processed...
The normal quota of 64 packets per napi instance should be the norm, and
net_rx_action() already has its own time limit.
Note : /proc/net/core/dev_weight can be used to tune this 64 default
value.

- Use DEFINE_PER_CPU_ALIGNED for softnet_data definition.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agorps: rps_sock_flow_table is mostly read
Eric Dumazet [Sat, 17 Apr 2010 07:54:36 +0000]
rps: rps_sock_flow_table is mostly read

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agorfs: Receive Flow Steering
Tom Herbert [Fri, 16 Apr 2010 23:01:27 +0000]
rfs: Receive Flow Steering

This patch implements receive flow steering (RFS).  RFS steers
received packets for layer 3 and 4 processing to the CPU where
the application for the corresponding flow is running.  RFS is an
extension of Receive Packet Steering (RPS).

The basic idea of RFS is that when an application calls recvmsg
(or sendmsg) the application's running CPU is stored in a hash
table that is indexed by the connection's rxhash which is stored in
the socket structure.  The rxhash is passed in skb's received on
the connection from netif_receive_skb.  For each received packet,
the associated rxhash is used to look up the CPU in the hash table,
if a valid CPU is set then the packet is steered to that CPU using
the RPS mechanisms.

The convolution of the simple approach is that it would potentially
allow OOO packets.  If threads are thrashing around CPUs or multiple
threads are trying to read from the same sockets, a quickly changing
CPU value in the hash table could cause rampant OOO packets--
we consider this a non-starter.

To avoid OOO packets, this solution implements two types of hash
tables: rps_sock_flow_table and rps_dev_flow_table.

rps_sock_table is a global hash table.  Each entry is just a CPU
number and it is populated in recvmsg and sendmsg as described above.
This table contains the "desired" CPUs for flows.

rps_dev_flow_table is specific to each device queue.  Each entry
contains a CPU and a tail queue counter.  The CPU is the "current"
CPU for a matching flow.  The tail queue counter holds the value
of a tail queue counter for the associated CPU's backlog queue at
the time of last enqueue for a flow matching the entry.

Each backlog queue has a queue head counter which is incremented
on dequeue, and so a queue tail counter is computed as queue head
count + queue length.  When a packet is enqueued on a backlog queue,
the current value of the queue tail counter is saved in the hash
entry of the rps_dev_flow_table.

And now the trick: when selecting the CPU for RPS (get_rps_cpu)
the rps_sock_flow table and the rps_dev_flow table for the RX queue
are consulted.  When the desired CPU for the flow (found in the
rps_sock_flow table) does not match the current CPU (found in the
rps_dev_flow table), the current CPU is changed to the desired CPU
if one of the following is true:

- The current CPU is unset (equal to RPS_NO_CPU)
- Current CPU is offline
- The current CPU's queue head counter >= queue tail counter in the
rps_dev_flow table.  This checks if the queue tail has advanced
beyond the last packet that was enqueued using this table entry.
This guarantees that all packets queued using this entry have been
dequeued, thus preserving in order delivery.

Making each queue have its own rps_dev_flow table has two advantages:
1) the tail queue counters will be written on each receive, so
keeping the table local to interrupting CPU s good for locality.  2)
this allows lockless access to the table-- the CPU number and queue
tail counter need to be accessed together under mutual exclusion
from netif_receive_skb, we assume that this is only called from
device napi_poll which is non-reentrant.

This patch implements RFS for TCP and connected UDP sockets.
It should be usable for other flow oriented protocols.

There are two configuration parameters for RFS.  The
"rps_flow_entries" kernel init parameter sets the number of
entries in the rps_sock_flow_table, the per rxqueue sysfs entry
"rps_flow_cnt" contains the number of entries in the rps_dev_flow
table for the rxqueue.  Both are rounded to power of two.

The obvious benefit of RFS (over just RPS) is that it achieves
CPU locality between the receive processing for a flow and the
applications processing; this can result in increased performance
(higher pps, lower latency).

The benefits of RFS are dependent on cache hierarchy, application
load, and other factors.  On simple benchmarks, we don't necessarily
see improvement and sometimes see degradation.  However, for more
complex benchmarks and for applications where cache pressure is
much higher this technique seems to perform very well.

Below are some benchmark results which show the potential benfit of
this patch.  The netperf test has 500 instances of netperf TCP_RR
test with 1 byte req. and resp.  The RPC test is an request/response
test similar in structure to netperf RR test ith 100 threads on
each host, but does more work in userspace that netperf.

e1000e on 8 core Intel
   No RFS or RPS 104K tps at 30% CPU
   No RFS (best RPS config):    290K tps at 63% CPU
   RFS 303K tps at 61% CPU

RPC test tps CPU% 50/90/99% usec latency Latency StdDev
  No RFS/RPS 103K 48% 757/900/3185 4472.35
  RPS only: 174K 73% 415/993/2468 491.66
  RFS 223K 73% 379/651/1382 315.61

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv6: fix the comment of ip6_xmit()
Shan Wei [Thu, 15 Apr 2010 16:48:48 +0000]
ipv6: fix the comment of ip6_xmit()

ip6_xmit() is used by upper transport protocol.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: replace ipfragok with skb->local_df
Shan Wei [Thu, 15 Apr 2010 16:43:08 +0000]
net: replace ipfragok with skb->local_df

As Herbert Xu said: we should be able to simply replace ipfragok
with skb->local_df. commit f88037(sctp: Drop ipfargok in sctp_xmit function)
has droped ipfragok and set local_df value properly.

The patch kills the ipfragok parameter of .queue_xmit().

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv6: cancel to setting local_df in ip6_xmit()
Shan Wei [Thu, 15 Apr 2010 16:39:14 +0000]
ipv6: cancel to setting local_df in ip6_xmit()

commit f88037(sctp: Drop ipfargok in sctp_xmit function)
has droped ipfragok and set local_df value properly.

So the change of commit 77e2f1(ipv6: Fix ip6_xmit to
send fragments if ipfragok is true) is not needed.
So the patch remove them.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet/l2tp/l2tp_debugfs.c: Convert NIPQUAD to %pI4
Joe Perches [Thu, 15 Apr 2010 22:37:13 +0000]
net/l2tp/l2tp_debugfs.c: Convert NIPQUAD to %pI4

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoMerge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville...
David S. Miller [Thu, 15 Apr 2010 21:31:06 +0000]
Merge branch 'for-davem' of git://git./linux/kernel/git/linville/wireless-next-2.6

9 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/ipmr-2.6
David S. Miller [Thu, 15 Apr 2010 21:14:05 +0000]
Merge branch 'master' of git://git./linux/kernel/git/kaber/ipmr-2.6

9 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Thu, 15 Apr 2010 20:21:34 +0000]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next-2.6 into for-davem

Conflicts:
Documentation/feature-removal-schedule.txt
drivers/net/wireless/ath/ath5k/phy.c
drivers/net/wireless/wl12xx/wl1271_main.c

9 years agoipv4: ipmr: fix NULL pointer deref during unres queue destruction
Patrick McHardy [Thu, 15 Apr 2010 11:29:28 +0000]
ipv4: ipmr: fix NULL pointer deref during unres queue destruction

Fix an oversight in ipmr_destroy_unres() - the net pointer is
unconditionally initialized to NULL, resulting in a NULL pointer
dereference later on.

Fix by adding a net pointer to struct mr_table and using it in
ipmr_destroy_unres().

Signed-off-by: Patrick McHardy <kaber@trash.net>

9 years agoipv4: ipmr: fix invalid cache resolving when adding a non-matching entry
Patrick McHardy [Thu, 15 Apr 2010 11:29:28 +0000]
ipv4: ipmr: fix invalid cache resolving when adding a non-matching entry

The patch to convert struct mfc_cache to list_heads (ipv4: ipmr: convert
struct mfc_cache to struct list_head) introduced a bug when adding new
cache entries that don't match any unresolved entries.

The unres queue is searched for a matching entry, which is then resolved.
When no matching entry is present, the iterator points to the head of the
list, but is treated as a matching entry. Use a seperate variable to
indicate that a matching entry was found.

Signed-off-by: Patrick McHardy <kaber@trash.net>

9 years agoipv4: ipmr: fix IP_MROUTE_MULTIPLE_TABLES Kconfig dependencies
Patrick McHardy [Thu, 15 Apr 2010 11:29:27 +0000]
ipv4: ipmr: fix IP_MROUTE_MULTIPLE_TABLES Kconfig dependencies

IP_MROUTE_MULTIPLE_TABLES should depend on IP_MROUTE.

Signed-off-by: Patrick McHardy <kaber@trash.net>

9 years agonet: CONFIG_SMP should be CONFIG_RPS
Changli Gao [Thu, 15 Apr 2010 07:16:59 +0000]
net: CONFIG_SMP should be CONFIG_RPS

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: netif_rx() must disable preemption
Eric Dumazet [Thu, 15 Apr 2010 07:14:07 +0000]
net: netif_rx() must disable preemption

Eric Paris reported netif_rx() is calling smp_processor_id() from
preemptible context, in particular when caller is
ip_dev_loopback_xmit().

RPS commit added this smp_processor_id() call, this patch makes sure
preemption is disabled. rps_get_cpus() wants rcu_read_lock() anyway, we
can dot it a bit earlier.

Reported-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoMerge branch 'vhost' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
David S. Miller [Thu, 15 Apr 2010 05:52:46 +0000]
Merge branch 'vhost' of git://git./linux/kernel/git/mst/vhost

9 years agoixgbe: fix bug with vlan strip in promsic mode
Jesse Brandeburg [Wed, 14 Apr 2010 23:04:23 +0000]
ixgbe: fix bug with vlan strip in promsic mode

The ixgbe driver was setting up 82598 hardware correctly, so that
when promiscuous mode was enabled hardware stripping was turned
off.  But on 82599 the logic to disable/enable hardware stripping
is different, and the code was not updated correctly when the
hardware vlan stripping was enabled as default.

This change comprises the creation of two new helper functions
and calling them from the right locations to disable and enable
hardware stripping of vlan tags at appropriate times.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agodrivers: net: use skb_headlen()
Eric Dumazet [Wed, 14 Apr 2010 22:59:40 +0000]
drivers: net: use skb_headlen()

replaces (skb->len - skb->data_len) occurrences by skb_headlen(skb)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agowireless: rt2x00: rt2800usb: identify Sitecom devices
Xose Vazquez Perez [Wed, 14 Apr 2010 10:16:00 +0000]
wireless: rt2x00: rt2800usb: identify Sitecom devices

A very useful information was provided by Sitecom R&D guys:

Please find the information regarding our latest Ralink adapters below;

WL-302    -    VID: 0x0DF6,    PID: 0x002D    -    Ralink RT2771
WL-315    -    VID: 0x0DF6,    PID: 0x0039    -    Ralink RT2770
WL-319    -    VID: 0x182D,    PID: 0x0037    -    Ralink RT2860
WL-321    -    VID: 0x0DF6,    PID: 0x003B    -    Ralink RT2770
WL-324    -    VID: 0x0DF6,    PID: 0x003D    -    Ralink RT2870
WL-329    -    VID: 0x0DF6,    PID: 0x0041    -    Ralink RT3572
WL-343    -    VID: 0x0DF6,    PID: 0x003E    -    Ralink RT3070
WL-344    -    VID: 0x0DF6,    PID: 0x0040    -    Ralink RT3071
WL-345    -    VID: 0x0DF6,    PID: 0x0042    -    Ralink RT3072
WL-608    -    VID: 0x0DF6,    PID: 0x003F    -    Ralink RT2070

Note:
PID: 0x003C, 0x004A, and 0x004D:   --these products do not exist; devices were never produced/shipped--

The WL-349v4 USB dongle (0x0df6,0x0050) will be shipped soon (it isn't available yet), and uses a Ralink RT3370 chipset.

Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

9 years agoar9170usb: add a couple more USB IDs
Christian Lamparter [Tue, 13 Apr 2010 16:10:26 +0000]
ar9170usb: add a couple more USB IDs

This patch adds the following 5 entries to the usbid device table:

 * Netgear WNA1000
 * Proxim ORiNOCO Dual Band 802.11n USB Adapter
 * 3Com Dual Band 802.11n USB Adapter
 * H3C Dual Band 802.11n USB Adapter
 * WNC Generic 11n USB dongle

CC: <stable@kernel.org>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

9 years agowl1251: don't require NVS data when EEPROM is used
Grazvydas Ignotas [Mon, 12 Apr 2010 22:21:40 +0000]
wl1251: don't require NVS data when EEPROM is used

If EEPROM is used, NVS data is now loaded but ignored.
Stop loading it to avoid need of dummy NVS file for modules with EEPROM.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Kalle Valo <kvalo@adurom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

9 years agoath9k-htc: fix lockdep warning and kernel warning after unplugging ar9271 usb device
Ming Lei [Mon, 12 Apr 2010 16:29:27 +0000]
ath9k-htc: fix lockdep warning and kernel warning after unplugging ar9271 usb device

This patch fixes two warnings below after unplugging ar9271 usb device:
-one is a kernel warning[1]
-another is a lockdep warning[2]

The root reason is that __skb_queue_purge can't be executed in hardirq
context, so the patch forks ath9k_skb_queue_purge(ath9k version of _skb_queue_purge),
which frees skb with dev_kfree_skb_any which can be run in hardirq
context safely, then prevent the lockdep warning and kernel warning after
unplugging ar9271 usb device.

[1] kernel warning
[  602.894005] ------------[ cut here ]------------
[  602.894005] WARNING: at net/core/skbuff.c:398 skb_release_head_state+0x71/0x87()
[  602.894005] Hardware name: 6475EK2
[  602.894005] Modules linked in: ath9k_htc ath9k ath9k_common ath9k_hw ath bridge stp llc sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table kvm_intel kvm arc4 ecb mac80211 snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep thinkpad_acpi snd_pcm snd_timer hwmon iTCO_wdt snd e1000e pcspkr i2c_i801 usbhid iTCO_vendor_support wmi cfg80211 yenta_socket rsrc_nonstatic pata_acpi snd_page_alloc soundcore uhci_hcd ohci_hcd ehci_hcd usbcore i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: ath]
[  602.894005] Pid: 2506, comm: ping Tainted: G        W  2.6.34-rc3-wl #20
[  602.894005] Call Trace:
[  602.894005]  <IRQ>  [<ffffffff8104a41c>] warn_slowpath_common+0x7c/0x94
[  602.894005]  [<ffffffffa022f398>] ? __skb_queue_purge+0x43/0x4a [ath9k_htc]
[  602.894005]  [<ffffffff8104a448>] warn_slowpath_null+0x14/0x16
[  602.894005]  [<ffffffff813269c1>] skb_release_head_state+0x71/0x87
[  602.894005]  [<ffffffff8132829a>] __kfree_skb+0x16/0x81
[  602.894005]  [<ffffffff813283b2>] kfree_skb+0x7e/0x86
[  602.894005]  [<ffffffffa022f398>] __skb_queue_purge+0x43/0x4a [ath9k_htc]
[  602.894005]  [<ffffffffa022f560>] __hif_usb_tx+0x1c1/0x21b [ath9k_htc]
[  602.894005]  [<ffffffffa022f73c>] hif_usb_tx_cb+0x12f/0x154 [ath9k_htc]
[  602.894005]  [<ffffffffa00d2fbe>] usb_hcd_giveback_urb+0x91/0xc5 [usbcore]
[  602.894005]  [<ffffffffa00f6c34>] ehci_urb_done+0x7a/0x8b [ehci_hcd]
[  602.894005]  [<ffffffffa00f6f33>] qh_completions+0x2ee/0x376 [ehci_hcd]
[  602.894005]  [<ffffffffa00f8ba5>] ehci_work+0x95/0x76e [ehci_hcd]
[  602.894005]  [<ffffffffa00fa5ae>] ? ehci_irq+0x2f/0x1d4 [ehci_hcd]
[  602.894005]  [<ffffffffa00fa725>] ehci_irq+0x1a6/0x1d4 [ehci_hcd]
[  602.894005]  [<ffffffff810a6d18>] ? __rcu_process_callbacks+0x7a/0x2df
[  602.894005]  [<ffffffff810a47a4>] ? handle_fasteoi_irq+0x22/0xd2
[  602.894005]  [<ffffffffa00d268d>] usb_hcd_irq+0x4a/0xa7 [usbcore]
[  602.894005]  [<ffffffff810a2853>] handle_IRQ_event+0x77/0x14f
[  602.894005]  [<ffffffff813285ce>] ? skb_release_data+0xc9/0xce
[  602.894005]  [<ffffffff810a4814>] handle_fasteoi_irq+0x92/0xd2
[  602.894005]  [<ffffffff8100c4fb>] handle_irq+0x88/0x91
[  602.894005]  [<ffffffff8100baed>] do_IRQ+0x63/0xc9
[  602.894005]  [<ffffffff81354245>] ? ip_flush_pending_frames+0x4d/0x5c
[  602.894005]  [<ffffffff813ba993>] ret_from_intr+0x0/0x16
[  602.894005]  <EOI>  [<ffffffff811095fe>] ? __delete_object+0x5a/0xb1
[  602.894005]  [<ffffffff813ba5f5>] ? _raw_write_unlock_irqrestore+0x47/0x7e
[  602.894005]  [<ffffffff813ba5fa>] ? _raw_write_unlock_irqrestore+0x4c/0x7e
[  602.894005]  [<ffffffff811095fe>] __delete_object+0x5a/0xb1
[  602.894005]  [<ffffffff81109814>] delete_object_full+0x25/0x31
[  602.894005]  [<ffffffff813a60c0>] kmemleak_free+0x26/0x45
[  602.894005]  [<ffffffff810ff517>] kfree+0xaa/0x149
[  602.894005]  [<ffffffff81323fb7>] ? sock_def_write_space+0x84/0x89
[  602.894005]  [<ffffffff81354245>] ? ip_flush_pending_frames+0x4d/0x5c
[  602.894005]  [<ffffffff813285ce>] skb_release_data+0xc9/0xce
[  602.894005]  [<ffffffff813282a2>] __kfree_skb+0x1e/0x81
[  602.894005]  [<ffffffff813283b2>] kfree_skb+0x7e/0x86
[  602.894005]  [<ffffffff81354245>] ip_flush_pending_frames+0x4d/0x5c
[  602.894005]  [<ffffffff81370c1f>] raw_sendmsg+0x653/0x709
[  602.894005]  [<ffffffff81379e31>] inet_sendmsg+0x54/0x5d
[  602.894005]  [<ffffffff813207a2>] ? sock_recvmsg+0xc6/0xdf
[  602.894005]  [<ffffffff813208c1>] sock_sendmsg+0xc0/0xd9
[  602.894005]  [<ffffffff810e13b4>] ? might_fault+0x68/0xb8
[  602.894005]  [<ffffffff810e13fd>] ? might_fault+0xb1/0xb8
[  602.894005]  [<ffffffff8132a1c3>] ? copy_from_user+0x2f/0x31
[  602.894005]  [<ffffffff8132a5b3>] ? verify_iovec+0x54/0x91
[  602.894005]  [<ffffffff81320d41>] sys_sendmsg+0x1da/0x241
[  602.894005]  [<ffffffff8103d327>] ? finish_task_switch+0x0/0xc9
[  602.894005]  [<ffffffff8103d327>] ? finish_task_switch+0x0/0xc9
[  602.894005]  [<ffffffff8107642e>] ? trace_hardirqs_on_caller+0x16/0x150
[  602.894005]  [<ffffffff813ba27d>] ? _raw_spin_unlock_irq+0x56/0x63
[  602.894005]  [<ffffffff8103d3cb>] ? finish_task_switch+0xa4/0xc9
[  602.894005]  [<ffffffff8103d327>] ? finish_task_switch+0x0/0xc9
[  602.894005]  [<ffffffff810357fe>] ? need_resched+0x23/0x2d
[  602.894005]  [<ffffffff8107642e>] ? trace_hardirqs_on_caller+0x16/0x150
[  602.894005]  [<ffffffff813b9750>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  602.894005]  [<ffffffff81009c02>] system_call_fastpath+0x16/0x1b
[  602.894005] ---[ end trace 91ba2d8dc7826839 ]---

[2] lockdep warning
[  169.363215] ======================================================
[  169.365390] [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
[  169.366334] 2.6.34-rc3-wl #20
[  169.366872] ------------------------------------------------------
[  169.366872] khubd/78 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[  169.366872]  (clock-AF_INET){++.?..}, at: [<ffffffff81323f51>] sock_def_write_space+0x1e/0x89
[  169.366872]
[  169.366872] and this task is already holding:
[  169.366872]  (&(&hif_dev->tx.tx_lock)->rlock){-.-...}, at: [<ffffffffa03715b0>] hif_usb_stop+0x24/0x53 [ath9k_htc]
[  169.366872] which would create a new lock dependency:
[  169.366872]  (&(&hif_dev->tx.tx_lock)->rlock){-.-...} -> (clock-AF_INET){++.?..}
[  169.366872]
[  169.366872] but this new dependency connects a HARDIRQ-irq-safe lock:
[  169.366872]  (&(&hif_dev->tx.tx_lock)->rlock){-.-...}
[  169.366872] ... which became HARDIRQ-irq-safe at:
[  169.366872]   [<ffffffff810772d5>] __lock_acquire+0x2c6/0xd2b
[  169.366872]   [<ffffffff8107866d>] lock_acquire+0xec/0x119
[  169.366872]   [<ffffffff813b99bb>] _raw_spin_lock+0x40/0x73
[  169.366872]   [<ffffffffa037163d>] hif_usb_tx_cb+0x5e/0x154 [ath9k_htc]
[  169.366872]   [<ffffffffa00d2fbe>] usb_hcd_giveback_urb+0x91/0xc5 [usbcore]
[  169.366872]   [<ffffffffa00f6c34>] ehci_urb_done+0x7a/0x8b [ehci_hcd]
[  169.366872]   [<ffffffffa00f6f33>] qh_completions+0x2ee/0x376 [ehci_hcd]
[  169.366872]   [<ffffffffa00f8ba5>] ehci_work+0x95/0x76e [ehci_hcd]
[  169.366872]   [<ffffffffa00fa725>] ehci_irq+0x1a6/0x1d4 [ehci_hcd]
[  169.366872]   [<ffffffffa00d268d>] usb_hcd_irq+0x4a/0xa7 [usbcore]
[  169.366872]   [<ffffffff810a2853>] handle_IRQ_event+0x77/0x14f
[  169.366872]   [<ffffffff810a4814>] handle_fasteoi_irq+0x92/0xd2
[  169.366872]   [<ffffffff8100c4fb>] handle_irq+0x88/0x91
[  169.366872]   [<ffffffff8100baed>] do_IRQ+0x63/0xc9
[  169.366872]   [<ffffffff813ba993>] ret_from_intr+0x0/0x16
[  169.366872]   [<ffffffff8130f6ee>] cpuidle_idle_call+0xa7/0x115
[  169.366872]   [<ffffffff81008c4f>] cpu_idle+0x68/0xc4
[  169.366872]   [<ffffffff813a41e0>] rest_init+0x104/0x10b
[  169.366872]   [<ffffffff81899db3>] start_kernel+0x3f1/0x3fc
[  169.366872]   [<ffffffff818992c8>] x86_64_start_reservations+0xb3/0xb7
[  169.366872]   [<ffffffff818993c4>] x86_64_start_kernel+0xf8/0x107
[  169.366872]
[  169.366872] to a HARDIRQ-irq-unsafe lock:
[  169.366872]  (clock-AF_INET){++.?..}
[  169.366872] ... which became HARDIRQ-irq-unsafe at:
[  169.366872] ...  [<ffffffff81077349>] __lock_acquire+0x33a/0xd2b
[  169.366872]   [<ffffffff8107866d>] lock_acquire+0xec/0x119
[  169.366872]   [<ffffffff813b9d07>] _raw_write_lock_bh+0x45/0x7a
[  169.366872]   [<ffffffff8135cf14>] tcp_close+0x165/0x34d
[  169.366872]   [<ffffffff8137aced>] inet_release+0x55/0x5c
[  169.366872]   [<ffffffff81321350>] sock_release+0x1f/0x6e
[  169.366872]   [<ffffffff813213c6>] sock_close+0x27/0x2b
[  169.366872]   [<ffffffff8110dd45>] __fput+0x125/0x1ca
[  169.366872]   [<ffffffff8110de04>] fput+0x1a/0x1c
[  169.366872]   [<ffffffff8110adc9>] filp_close+0x68/0x72
[  169.366872]   [<ffffffff8110ae80>] sys_close+0xad/0xe7
[  169.366872]   [<ffffffff81009c02>] system_call_fastpath+0x16/0x1b

(Trimmed at the "other info that might help us debug this" line in
the interest of brevity... -- JWL)

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Acked-by: Sujith <Sujith.Manoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

9 years agoath9k-htc:respect usb buffer cacheline alignment in reg out path
Ming Lei [Mon, 12 Apr 2010 16:29:15 +0000]
ath9k-htc:respect usb buffer cacheline alignment in reg out path

In ath9k-htc register out path, ath9k-htc will pass skb->data into
usb hcd and usb hcd will do dma mapping and unmapping to the buffer
pointed by skb->data, so we should pass a cache-line aligned address.

This patch replace __dev_alloc_skb with alloc_skb to make skb->data
pointed to a cacheline aligned address simply since ath9k-htc does not
skb_push on the skb and pass it to mac80211, also use kfree_skb to free
the skb allocated by alloc_skb(we can use kfree_skb safely in hardirq
context since skb->destructor is NULL always in the path).

Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

9 years agoath9k-htc:respect usb buffer cacheline alignment in reg in path
Ming Lei [Mon, 12 Apr 2010 16:29:05 +0000]
ath9k-htc:respect usb buffer cacheline alignment in reg in path

In ath9k-htc register in path, ath9k-htc will pass skb->data into
usb hcd and usb hcd will do dma mapping and unmapping to the buffer
pointed by skb->data, so we should pass a cache-line aligned address.

This patch replace __dev_alloc_skb with alloc_skb to make skb->data
pointed to a cacheline aligned address simply since ath9k-htc does not
skb_push on the skb and pass it to mac80211, also use kfree_skb to free
the skb allocated by alloc_skb(we can use kfree_skb safely in hardirq
context since skb->destructor is NULL always in the path).

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

9 years agoath9k-htc:respect usb buffer cacheline alignment in ath9k_hif_usb_alloc_rx_urbs
Ming Lei [Mon, 12 Apr 2010 16:28:53 +0000]
ath9k-htc:respect usb buffer cacheline alignment in ath9k_hif_usb_alloc_rx_urbs

In ath9k_hif_usb_alloc_rx_urbs, ath9k-htc will pass skb->data into
usb hcd and usb hcd will do dma mapping and unmapping to the buffer
pointed by skb->data, so we should pass a cache-line aligned address.

This patch replace __dev_alloc_skb with alloc_skb to make skb->data
pointed to a cacheline aligned address simply since ath9k-htc does not
skb_push on the skb and pass it to mac80211, also use kfree_skb to free
the skbs allocated by alloc_skb(we can use kfree_skb safely in hardirq
context since skb->destructor is NULL always in the path).

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

9 years agoath5k: treat RXORN as non-fatal
Bruno Randolf [Mon, 12 Apr 2010 07:38:52 +0000]
ath5k: treat RXORN as non-fatal

We get RXORN interrupts when all receive buffers are full. This is not
necessarily a fatal situation. It can also happen when the bus is busy or the
CPU is not fast enough to process all frames.

Older chipsets apparently need a reset to come out of this situration, but on
newer chips we can treat RXORN like RX, as going thru a full reset does more
harm than good, there.

The exact chip revisions which need a reset are unknown - this guess
AR5K_SREV_AR5212 ("venice") is copied from the HAL.

Inspired by openwrt 413-rxorn.patch:
"treat rxorn like rx, reset after rxorn seems to do more harm than good"

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

9 years agoath5k: Use high bitrates for ACK/CTS
Bruno Randolf [Mon, 12 Apr 2010 07:38:47 +0000]
ath5k: Use high bitrates for ACK/CTS

There was a confusion in the usage of the bits AR5K_STA_ID1_ACKCTS_6MB and
AR5K_STA_ID1_BASE_RATE_11B. If they are set (1), we will get lower bitrates for
ACK and CTS. Therefore ath5k_hw_set_ack_bitrate_high(ah, false) actually
resulted in high bitrates, which i think is what we want anyways. Cleared the
confusion and added some documentation.

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

9 years agovirtio_net: Fix mis-merge.
David S. Miller [Wed, 14 Apr 2010 13:45:44 +0000]
virtio_net: Fix mis-merge.

Pointed out by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Wed, 14 Apr 2010 12:01:33 +0000]
Merge branch 'master' of /linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/pcmcia/smc91c92_cs.c
drivers/net/virtio_net.c

9 years agotun: orphan an skb on tx
Michael S. Tsirkin [Tue, 13 Apr 2010 04:59:44 +0000]
tun: orphan an skb on tx

The following situation was observed in the field:
tap1 sends packets, tap2 does not consume them, as a result
tap1 can not be closed. This happens because
tun/tap devices can hang on to skbs undefinitely.

As noted by Herbert, possible solutions include a timeout followed by a
copy/change of ownership of the skb, or always copying/changing
ownership if we're going into a hostile device.

This patch implements the second approach.

Note: one issue still remaining is that since skbs
keep reference to tun socket and tun socket has a
reference to tun device, we won't flush backlog,
instead simply waiting for all skbs to get transmitted.
At least this is not user-triggerable, and
this was not reported in practice, my assumption is
other devices besides tap complete an skb
within finite time after it has been queued.

A possible solution for the second issue
would not to have socket reference the device,
instead, implement dev->destructor for tun, and
wait for all skbs to complete there, but this
needs some thought, probably too risky for 2.6.34.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Yan Vugenfirer <yvugenfi@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agostmmac: updated the drv module version
Giuseppe CAVALLARO [Tue, 13 Apr 2010 20:21:17 +0000]
stmmac: updated the drv module version

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agostmmac: fix vlan support setup
Giuseppe CAVALLARO [Tue, 13 Apr 2010 20:21:16 +0000]
stmmac: fix vlan support setup

Moved STMMAC_VLAN_TAG_USED from stmmac.h to common.h header
because it is used within the device and descriptor cores.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agostmmac: get the descriptor structure from platform
Giuseppe CAVALLARO [Tue, 13 Apr 2010 20:21:15 +0000]
stmmac: get the descriptor structure from platform

Output for chip that uses the Enhanced descriptors:
[snip]
STMMAC driver:
platform registration... done!
DWMAC1000 - user ID: 0x10, Synopsys ID: 0x33
Enhanced descriptor structure
no valid MAC address;please, use ifconfig or nwhwconfig!
eth0 - (dev. name: stmmaceth - id: 0, IRQ #134
IO base addr: 0xfd110000)
STMMAC MII Bus: probed
[snip]

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agostmmac: new descriptor field for the driver's platform
Giuseppe CAVALLARO [Tue, 13 Apr 2010 20:21:14 +0000]
stmmac: new descriptor field for the driver's platform

The new enh_desc is used for selecting the enhanced descriptors
structure. There are several scenarios; some chips (mac10/100
or gmac) want to use the enhanced descriptors; others want the normal
ones.
For example, on ST platforms: MAC10/100 uses the normal desc structure
and the GMAC uses the enhanced one.
It can be useful to get this information from the platform.
This could also be decided at run-time looking at the chip's ID number;
but it could happen that chips with the same ID want to use different
descriptor structure.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agostmmac: fix Transmit FIFO flush operation
Giuseppe CAVALLARO [Tue, 13 Apr 2010 20:21:13 +0000]
stmmac: fix Transmit FIFO flush operation

Fix the Transmit FIFO flush operation; it was
disabled while reworking the descriptor structures.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agostmmac: rework normal and enhanced descriptors
Giuseppe CAVALLARO [Tue, 13 Apr 2010 20:21:12 +0000]
stmmac: rework normal and enhanced descriptors

Currently the driver assumes that the mac10/100 can only use the
normal descriptor structure and the gmac can only use the
enhanced structures.
This patch removes the descriptor's code from the dma files
and adds two new files just for handling the normal and enhanced
descriptors.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agostmmac: split core and dma for the mac10/100
Giuseppe CAVALLARO [Tue, 13 Apr 2010 20:21:11 +0000]
stmmac: split core and dma for the mac10/100

The patch splits core and dma parts for the mac10/100 device.
This was already done for the GMAC device.
It should make more flexible the driver to support other chips.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agovhost: fix sparse warnings
Christoph Hellwig [Tue, 13 Apr 2010 18:11:25 +0000]
vhost: fix sparse warnings

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

9 years agoforcedeth: fix tx limit2 flag check
Ayaz Abdulla [Wed, 14 Apr 2010 01:49:51 +0000]
forcedeth: fix tx limit2 flag check

This is a fix for bug 572201 @ bugs.debian.org

This patch fixes the TX_LIMIT feature flag. The previous logic check
for TX_LIMIT2 also took into account a device that only had TX_LIMIT
set.

Reported-by: Stephen Mulcahu <stephen.mulcahy@deri.org>
Reported-by: Ben Huchings <ben@decadent.org.uk>
Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv4: ipmr: support multiple tables
Patrick McHardy [Tue, 13 Apr 2010 05:03:23 +0000]
ipv4: ipmr: support multiple tables

This patch adds support for multiple independant multicast routing instances,
named "tables".

Userspace multicast routing daemons can bind to a specific table instance by
issuing a setsockopt call using a new option MRT_TABLE. The table number is
stored in the raw socket data and affects all following ipmr setsockopt(),
getsockopt() and ioctl() calls. By default, a single table (RT_TABLE_DEFAULT)
is created with a default routing rule pointing to it. Newly created pimreg
devices have the table number appended ("pimregX"), with the exception of
devices created in the default table, which are named just "pimreg" for
compatibility reasons.

Packets are directed to a specific table instance using routing rules,
similar to how regular routing rules work. Currently iif, oif and mark
are supported as keys, source and destination addresses could be supported
additionally.

Example usage:

- bind pimd/xorp/... to a specific table:

uint32_t table = 123;
setsockopt(fd, IPPROTO_IP, MRT_TABLE, &table, sizeof(table));

- create routing rules directing packets to the new table:

# ip mrule add iif eth0 lookup 123
# ip mrule add oif eth0 lookup 123

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv4: ipmr: move mroute data into seperate structure
Patrick McHardy [Tue, 13 Apr 2010 05:03:22 +0000]
ipv4: ipmr: move mroute data into seperate structure

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv4: ipmr: convert struct mfc_cache to struct list_head
Patrick McHardy [Tue, 13 Apr 2010 05:03:21 +0000]
ipv4: ipmr: convert struct mfc_cache to struct list_head

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv4: ipmr: remove net pointer from struct mfc_cache
Patrick McHardy [Tue, 13 Apr 2010 05:03:20 +0000]
ipv4: ipmr: remove net pointer from struct mfc_cache

Now that cache entries in unres_queue don't need to be distinguished by their
network namespace pointer anymore, we can remove it from struct mfc_cache
add pass the namespace as function argument to the functions that need it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv4: ipmr: move unres_queue and timer to per-namespace data
Patrick McHardy [Tue, 13 Apr 2010 05:03:19 +0000]
ipv4: ipmr: move unres_queue and timer to per-namespace data

The unres_queue is currently shared between all namespaces. Following patches
will additionally allow to create multiple multicast routing tables in each
namespace. Having a single shared queue for all these users seems to excessive,
move the queue and the cleanup timer to the per-namespace data to unshare it.

As a side-effect, this fixes a bug in the seq file iteration functions: the
first entry returned is always from the current namespace, entries returned
after that may belong to any namespace.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv4: raw: move struct raw_sock and raw_sk() to include/net/raw.h
Patrick McHardy [Tue, 13 Apr 2010 05:03:18 +0000]
ipv4: raw: move struct raw_sock and raw_sk() to include/net/raw.h

A following patch will use struct raw_sock to store state for ipmr,
so having the definitions in icmp.h doesn't fit very well anymore.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: fib_rules: decouple address families from real address families
Patrick McHardy [Tue, 13 Apr 2010 05:03:17 +0000]
net: fib_rules: decouple address families from real address families

Decouple the address family values used for fib_rules from the real
address families in socket.h. This allows to use fib_rules for
code that is not a real address family without increasing AF_MAX/NPROTO.

Values up to 127 are reserved for real address families and map directly
to the corresponding AF value, values starting from 128 are for other
uses. rtnetlink is changed to invoke the AF_UNSPEC dumpit/doit handlers
for these families.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: fib_rules: set family in fib_rule_hdr centrally
Patrick McHardy [Tue, 13 Apr 2010 05:03:16 +0000]
net: fib_rules: set family in fib_rule_hdr centrally

All fib_rules implementations need to set the family in their ->fill()
functions. Since the value is available to the generic fib_nl_fill_rule()
function, set it there.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: fib_rules: consolidate IPv4 and DECnet ->default_pref() functions.
Patrick McHardy [Tue, 13 Apr 2010 05:03:15 +0000]
net: fib_rules: consolidate IPv4 and DECnet ->default_pref() functions.

Both functions are equivalent, consolidate them since a following patch
needs a third implementation for multicast routing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agodst: don't inline dst_ifdown
stephen hemminger [Mon, 12 Apr 2010 07:38:05 +0000]
dst: don't inline dst_ifdown

The function dst_ifdown is called only two places but in a non-
performance critical code path, there is no reason to inline it.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobe2net: clarify promiscuous cmd with a comment
Sathya Perla [Sun, 11 Apr 2010 22:35:27 +0000]
be2net: clarify promiscuous cmd with a comment

The promiscous cmd config code gives an impression that
setting a port to promisc mode will unset the other port.
This is not the case and is clarified with a comment.

Signed-off-by: Sathya Perla <sathyap@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agodrivers: net: last_rx elimination
Eric Dumazet [Sat, 10 Apr 2010 22:48:14 +0000]
drivers: net: last_rx elimination

Network drivers do not have to update last_rx, unless they need it for
their private use.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: uninline skb_bond_should_drop()
Eric Dumazet [Sun, 11 Apr 2010 06:56:11 +0000]
net: uninline skb_bond_should_drop()

skb_bond_should_drop() is too big to be inlined.

This patch reduces kernel text size, and its compilation time as well
(shrinking include/linux/netdevice.h)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoFix some #includes in CAN drivers (rebased for net-next-2.6)
Hans J. Koch [Tue, 13 Apr 2010 00:03:25 +0000]
Fix some #includes in CAN drivers (rebased for net-next-2.6)

In the current implementation, CAN drivers need to #include <linux/can.h>
_before_ they #include <linux/can/dev.h>, which is both ugly and
unnecessary.

Fix this by including <linux/can.h> in <linux/can/dev.h> and remove the
#include <linux/can.h> lines from drivers.

Signed-off-by: Hans J. Koch <hjk@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agobcm63xx_enet: do not overwrite ENET_CTL_REG value
Florian Fainelli [Fri, 9 Apr 2010 01:04:52 +0000]
bcm63xx_enet: do not overwrite ENET_CTL_REG value

bcm_enet_hw_preinit will correctly set values in ENET_CTL_REG for internal
or external MII operations, however, bcm_enet_open will blindly overwrite the
ENET_CTL_REG register value and thus we will loose any changes to it that
were made in bcm_enet_hw_preinit, rendering external MII operations non-working.

This would lead to the driver not being able to check for link availability on
external PHY setups, and thus we would never get to sending packets because
link was down from the driver side.

This was completely un-noticed because all boards out there but BCM6338-based
ones use internal phy on their enet0 interface.

Signed-off-by: Florian Fainelli <ffainelli@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoPHY: fix typo in bcm63xx PHY driver table
Florian Fainelli [Fri, 9 Apr 2010 01:04:45 +0000]
PHY: fix typo in bcm63xx PHY driver table

Signed-off-by: Florian Fainelli <ffainelli@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agochelsio: Fix build warning.
David S. Miller [Tue, 13 Apr 2010 10:07:17 +0000]
chelsio: Fix build warning.

GCC warns that:

drivers/net/chelsio/sge.c:463:11: warning: operation on 's->port' may be undefined

Better to eliminate the side effects in the calculation and
express what was intended here.

Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agosmc91c92_cs: define multicast_table as unsigned char
Ken Kawasaki [Sat, 10 Apr 2010 12:50:14 +0000]
smc91c92_cs: define multicast_table as unsigned char

smc91c92_cs:
  * define multicast_table as unsigned char
  * remove unnecessary "#ifndef final_version"

Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agocan: avoids a false warning
Eric Dumazet [Fri, 9 Apr 2010 23:47:31 +0000]
can: avoids a false warning

At this point optlen == sizeof(sfilter) but some compilers are dumb.

Reported-by: Németh Márton <nm127@freemail.h
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoe1000e: stop cleaning when we reach tx_ring->next_to_use
Terry Loftin [Fri, 9 Apr 2010 10:29:49 +0000]
e1000e: stop cleaning when we reach tx_ring->next_to_use

Tx ring buffers after tx_ring->next_to_use are volatile and could
change, possibly causing a crash.  Stop cleaning when we hit
tx_ring->next_to_use.

Signed-off-by: Terry Loftin <terry.loftin@hp.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoigb: restrict WoL for 82576 ET2 Quad Port Server Adapter
Stefan Assmann [Fri, 9 Apr 2010 09:51:34 +0000]
igb: restrict WoL for 82576 ET2 Quad Port Server Adapter

Restrict Wake-on-LAN to first port on 82576 ET2 quad port NICs, as it is
only supported there.

Signed-off-by: Stefan Assmann <sassmann@redhat.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoe1000e: use static params to save stack space
Jesse Brandeburg [Fri, 9 Apr 2010 10:51:09 +0000]
e1000e: use static params to save stack space

used a modified checkstack to get the 56 number
(normally checkstack wouldn't show this low a value)

checkstack before:
0x0000012f e1000e_check_options [e1000e]:               272

after:
0x0000012f e1000e_check_options [e1000e]:                56

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoskbuff: remove unused dev_consume_skb macro definition
Alexander Duyck [Fri, 9 Apr 2010 10:01:37 +0000]
skbuff: remove unused dev_consume_skb macro definition

dev_consume_skb and kfree_skb_clean have no users and in the case of
kfree_skb_clean could cause potential build issues since I cannot find
where it is defined.  Based on the patch in which it was introduced it
appears to have been a bit of leftover code from an earlier version of the
patch in which kfree_skb_clean was dropped in favor of consume_skb.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoigb: modify register test for i350 to reflect read only bits in RDLEN/TDLEN
Alexander Duyck [Fri, 9 Apr 2010 09:53:08 +0000]
igb: modify register test for i350 to reflect read only bits in RDLEN/TDLEN

The registers for RDLEN/TDLEN on i350 have the first 7 bits as read only.
This is a change from previous hardware in which it was only the first 4
bits that were read only.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agomyri10ge: use the DMA state API instead of the pci equivalents
FUJITA Tomonori [Mon, 12 Apr 2010 14:32:10 +0000]
myri10ge: use the DMA state API instead of the pci equivalents

This replace the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Andrew Gallatin <gallatin@myri.com>
Cc: Brice Goglin <brice@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoqlge: use the DMA state API instead of the pci equivalents
FUJITA Tomonori [Mon, 12 Apr 2010 14:32:14 +0000]
qlge: use the DMA state API instead of the pci equivalents

This replace the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agocxgb3: use the DMA state API instead of the pci equivalents
FUJITA Tomonori [Mon, 12 Apr 2010 14:32:12 +0000]
cxgb3: use the DMA state API instead of the pci equivalents

This replace the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agochelsio: use the DMA state API instead of the pci equivalents
FUJITA Tomonori [Mon, 12 Apr 2010 14:32:11 +0000]
chelsio: use the DMA state API instead of the pci equivalents

This replace the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoqla3xxx: use the DMA state API instead of the pci equivalents
FUJITA Tomonori [Mon, 12 Apr 2010 14:32:13 +0000]
qla3xxx: use the DMA state API instead of the pci equivalents

This replace the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotg3: use the DMA state API instead of the pci equivalents
FUJITA Tomonori [Mon, 12 Apr 2010 14:32:09 +0000]
tg3: use the DMA state API instead of the pci equivalents

This replace the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoIPv6: only notify protocols if address is compeletely gone
stephen hemminger [Mon, 12 Apr 2010 05:41:34 +0000]
IPv6: only notify protocols if address is compeletely gone

The notifier for address down should only be called if address is completely
gone, not just being marked as tentative on link transistion. The code
in net-next would case bonding/sctp/s390 to see address disappear on link
down, but they would never see it reappear on link up.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoipv6: additional ref count for hash list unnecessary
stephen hemminger [Mon, 12 Apr 2010 05:41:33 +0000]
ipv6: additional ref count for hash list unnecessary

Since an address in hash list has to already have a ref count,
no additional ref count is needed.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoIPv6: keep tentative addresses in hash table
stephen hemminger [Mon, 12 Apr 2010 05:41:32 +0000]
IPv6: keep tentative addresses in hash table

When link goes down, want address to be preserved but in a tentative
state, therefore it has to stay in hash list.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agoIPv6: keep route for tentative address
stephen hemminger [Mon, 12 Apr 2010 05:41:31 +0000]
IPv6: keep route for tentative address

Recent changes preserve IPv6 address when link goes down (good).
But would cause address to point to dead dst entry (bad).
The simplest fix is to just not delete route if address is
being held for later use.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotg3: Update version to 3.110
Matt Carlson [Mon, 12 Apr 2010 06:58:31 +0000]
tg3: Update version to 3.110

This patch updates the tg3 version to 3.110.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotg3: Remove function errors flagged by checkpatch
Matt Carlson [Mon, 12 Apr 2010 06:58:30 +0000]
tg3: Remove function errors flagged by checkpatch

This patch removes the following checkpatch errors:

* return is not a function, parentheses are not required
* space prohibited between function name and open parenthesis '('

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotg3: Unify max pkt size preprocessor constants
Matt Carlson [Mon, 12 Apr 2010 06:58:29 +0000]
tg3: Unify max pkt size preprocessor constants

The maximum packet size that gets programmed into the standard producer
ring control block is directly related to the packet size used to
allocate packet buffers.  This patch removes the redundant preprocessor
constant.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotg3: Re-inline VLAN tags when appropriate
Matt Carlson [Mon, 12 Apr 2010 06:58:28 +0000]
tg3: Re-inline VLAN tags when appropriate

The tg3 driver is written so that VLAN tagged packets can be accepted,
even if CONFIG_VLAN_8021Q or CONFIG_VLAN_8021Q_MODULE is not defined.
(Think raw interfaces.)  If the device has ASF support enabled, the
firmware requires the driver to enable VLAN tag stripping.  If VLAN
tagging is not explicitly supported by the kernel and ASF is enabled,
the driver will have to reinject the VLAN tag back into the packet
stream.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotg3: Optimize rx double copy test
Matt Carlson [Mon, 12 Apr 2010 06:58:27 +0000]
tg3: Optimize rx double copy test

On a PCIX bus, the 5701 has a bug which requires the driver to double
copy all rx packets.  The rx code uses the rx_offset device member as a
flag to determine if this workaround should take effect.  The following
patch will modify the rx_offset member such that this test will become
less clear.

The patch starts by integrating the workaround check into the packet
length check.  It rounds out the implementation by relaxing the
workaround restrictions if the platform has efficient unaligned
accesses.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotg3: Reduce 57765 core clock when link at 10Mbps
Matt Carlson [Mon, 12 Apr 2010 06:58:26 +0000]
tg3: Reduce 57765 core clock when link at 10Mbps

This patch reduces the core clock to 6.25MHz when operating at 10Mbps
link speed.  This is needed to prevent a bug that will ultimately cause
transmits to cease.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotg3: Set card 57765 card reader MRRS to 1024B
Matt Carlson [Mon, 12 Apr 2010 06:58:25 +0000]
tg3: Set card 57765 card reader MRRS to 1024B

This patch sets the Maximum Read Request Size for the card reader
function to 1024 bytes to prevent an SD controller lockup.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agotg3: Disable CLKREQ in L2
Matt Carlson [Mon, 12 Apr 2010 06:58:24 +0000]
tg3: Disable CLKREQ in L2

This patch disables CLKREQ in L2 to workaround a chipset bug.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9 years agonet: sk_dst_cache RCUification
Eric Dumazet [Thu, 8 Apr 2010 23:03:29 +0000]
net: sk_dst_cache RCUification

With latest CONFIG_PROVE_RCU stuff, I felt more comfortable to make this
work.

sk->sk_dst_cache is currently protected by a rwlock (sk_dst_lock)

This rwlock is readlocked for a very small amount of time, and dst
entries are already freed after RCU grace period. This calls for RCU
again :)

This patch converts sk_dst_lock to a spinlock, and use RCU for readers.

__sk_dst_get() is supposed to be called with rcu_read_lock() or if
socket locked by user, so use appropriate rcu_dereference_check()
condition (rcu_read_lock_held() || sock_owned_by_user(sk))

This patch avoids two atomic ops per tx packet on UDP connected sockets,
for example, and permits sk_dst_lock to be much less dirtied.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>