Herbert Xu [Fri, 9 Jun 2006 23:11:27 +0000 (16:11 -0700)]
[NET] ppp: Remove unnecessary pskb_may_pull
In ppp_receive_nonmp_frame, we call pskb_may_pull(skb, skb->len) if the
tailroom is >= 124. This is pointless because this pskb_may_pull is only
needed if the skb is non-linear. However, if it is non-linear then the
tailroom would be zero.
So it can be safely removed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 9 Jun 2006 23:10:40 +0000 (16:10 -0700)]
[NET]: Clean up skb_linearize
The linearisation operation doesn't need to be super-optimised. So we can
replace __skb_linearize with __pskb_pull_tail which does the same thing but
is more general.
Also, most users of skb_linearize end up testing whether the skb is linear
or not so it helps to make skb_linearize do just that.
Some callers of skb_linearize also use it to copy cloned data, so it's
useful to have a new function skb_linearize_cow to copy the data if it's
either non-linear or cloned.
Last but not least, I've removed the gfp argument since nobody uses it
anymore. If it's ever needed we can easily add it back.
Misc bugs fixed by this patch:
* via-velocity error handling (also, no SG => no frags)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 9 Jun 2006 19:20:56 +0000 (12:20 -0700)]
[NET]: Add netif_tx_lock
Various drivers use xmit_lock internally to synchronise with their
transmission routines. They do so without setting xmit_lock_owner.
This is fine as long as netpoll is not in use.
With netpoll it is possible for deadlocks to occur if xmit_lock_owner
isn't set. This is because if a printk occurs while xmit_lock is held
and xmit_lock_owner is not set can cause netpoll to attempt to take
xmit_lock recursively.
While it is possible to resolve this by getting netpoll to use
trylock, it is suboptimal because netpoll's sole objective is to
maximise the chance of getting the printk out on the wire. So
delaying or dropping the message is to be avoided as much as possible.
So the only alternative is to always set xmit_lock_owner. The
following patch does this by introducing the netif_tx_lock family of
functions that take care of setting/unsetting xmit_lock_owner.
I renamed xmit_lock to _xmit_lock to indicate that it should not be
used directly. I didn't provide irq versions of the netif_tx_lock
functions since xmit_lock is meant to be a BH-disabling lock.
This is pretty much a straight text substitution except for a small
bug fix in winbond. It currently uses
netif_stop_queue/spin_unlock_wait to stop transmission. This is
unsafe as an IRQ can potentially wake up the queue. So it is safer to
use netif_tx_disable.
The hamradio bits used spin_lock_irq but it is unnecessary as
xmit_lock must never be taken in an IRQ handler.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
James Morris [Fri, 9 Jun 2006 07:33:33 +0000 (00:33 -0700)]
[SECMARK]: Add new packet controls to SELinux
Add new per-packet access controls to SELinux, replacing the old
packet controls.
Packets are labeled with the iptables SECMARK and CONNSECMARK targets,
then security policy for the packets is enforced with these controls.
To allow for a smooth transition to the new controls, the old code is
still present, but not active by default. To restore previous
behavior, the old controls may be activated at runtime by writing a
'1' to /selinux/compat_net, and also via the kernel boot parameter
selinux_compat_net. Switching between the network control models
requires the security load_policy permission. The old controls will
probably eventually be removed and any continued use is discouraged.
With this patch, the new secmark controls for SElinux are disabled by
default, so existing behavior is entirely preserved, and the user is
not affected at all.
It also provides a config option to enable the secmark controls by
default (which can always be overridden at boot and runtime). It is
also noted in the kconfig help that the user will need updated
userspace if enabling secmark controls for SELinux and that they'll
probably need the SECMARK and CONNMARK targets, and conntrack protocol
helpers, although such decisions are beyond the scope of kernel
configuration.
Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
James Morris [Fri, 9 Jun 2006 07:32:39 +0000 (00:32 -0700)]
[SECMARK]: Add CONNSECMARK xtables target
Add a new xtables target, CONNSECMARK, which is used to specify rules
for copying security marks from packets to connections, and for
copyying security marks back from connections to packets. This is
similar to the CONNMARK target, but is more limited in scope in that
it only allows copying of security marks to and from packets, as this
is all it needs to do.
A typical scenario would be to apply a security mark to a 'new' packet
with SECMARK, then copy that to its conntrack via CONNMARK, and then
restore the security mark from the connection to established and
related packets on that connection.
Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
James Morris [Fri, 9 Jun 2006 07:31:46 +0000 (00:31 -0700)]
[SECMARK]: Add secmark support to conntrack
Add a secmark field to IP and NF conntracks, so that security markings
on packets can be copied to their associated connections, and also
copied back to packets as required. This is similar to the network
mark field currently used with conntrack, although it is intended for
enforcement of security policy rather than network policy.
Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
James Morris [Fri, 9 Jun 2006 07:29:17 +0000 (00:29 -0700)]
[SECMARK]: Add secmark support to core networking.
Add a secmark field to the skbuff structure, to allow security subsystems to
place security markings on network packets. This is similar to the nfmark
field, except is intended for implementing security policy, rather than than
networking policy.
This patch was already acked in principle by Dave Miller.
Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
James Morris [Fri, 9 Jun 2006 07:27:28 +0000 (00:27 -0700)]
[SECMARK]: Add new flask definitions to SELinux
Secmark implements a new scheme for adding security markings to
packets via iptables, as well as changes to SELinux to use these
markings for security policy enforcement. The rationale for this
scheme is explained and discussed in detail in the original threads:
Examples of policy and rulesets, as well as a full archive of patches
for iptables and SELinux userland, may be found at:
http://people.redhat.com/jmorris/selinux/secmark/
The code has been tested with various compilation options and in
several scenarios, including with 'complicated' protocols such as FTP
and also with the new generic conntrack code with IPv6 connection
tracking.
This patch:
Add support for a new object class ('packet'), and associated
permissions ('send', 'recv', 'relabelto'). These are used to enforce
security policy for network packets labeled with SECMARK, and for
adding labeling rules.
Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[SELINUX]: add security class for appletalk sockets
Add a security class for appletalk sockets so that they can be
distinguished in SELinux policy. Please apply.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Catherine Zhang [Fri, 9 Jun 2006 06:39:49 +0000 (23:39 -0700)]
[LSM-IPsec]: SELinux Authorize
This patch contains a fix for the previous patch that adds security
contexts to IPsec policies and security associations. In the previous
patch, no authorization (besides the check for write permissions to
SAD and SPD) is required to delete IPsec policies and security
assocations with security contexts. Thus a user authorized to change
SAD and SPD can bypass the IPsec policy authorization by simply
deleteing policies with security contexts. To fix this security hole,
an additional authorization check is added for removing security
policies and security associations with security contexts.
Note that if no security context is supplied on add or present on
policy to be deleted, the SELinux module allows the change
unconditionally. The hook is called on deletion when no context is
present, which we may want to change. At present, I left it up to the
module.
LSM changes:
The patch adds two new LSM hooks: xfrm_policy_delete and
xfrm_state_delete. The new hooks are necessary to authorize deletion
of IPsec policies that have security contexts. The existing hooks
xfrm_policy_free and xfrm_state_free lack the context to do the
authorization, so I decided to split authorization of deletion and
memory management of security data, as is typical in the LSM
interface.
Use:
The new delete hooks are checked when xfrm_policy or xfrm_state are
deleted by either the xfrm_user interface (xfrm_get_policy,
xfrm_del_sa) or the pfkey interface (pfkey_spddelete, pfkey_delete).
SELinux changes:
The new policy_delete and state_delete functions are added.
Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com> Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Schwab [Tue, 6 Jun 2006 04:21:57 +0000 (21:21 -0700)]
[CONNECTOR]: Fix warning in cn_queue.c
cn_queue.c:130: warning: value computed is not used
There is no point in testing the atomic value if the result is thrown
away.
From Evgeniy:
It was created to put implicit smp barrier, but it is not needed there.
Signed-off-by: Andreas Schwab <schwab@suse.de> Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 6 Jun 2006 00:59:20 +0000 (17:59 -0700)]
[TCP]: Fix compile warning in tcp_probe.c
The suseconds_t et al. are not necessarily any particular type on
every platform, so cast to unsigned long so that we can use one printf
format string and avoid warnings across the board
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a new module for tracking TCP state variables non-intrusively
using kprobes. It has a simple /proc interface that outputs one line
for each packet received. A sample usage is to collect congestion
window and ssthresh over time graphs.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Many of the TCP congestion methods all just use ssthresh
as the minimum congestion window on decrease. Rather than
duplicating the code, just have that be the default if that
handle in the ops structure is not set.
Minor behaviour change to TCP compound. It probably wants
to use this (ssthresh) as lower bound, rather than ssthresh/2
because the latter causes undershoot on loss.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The original code did a 64 bit divide directly, which won't work on
32 bit platforms. Rather than doing a 64 bit square root twice,
just implement a 4th root function in one pass using Newton's method.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
TCP Compound is a sender-side only change to TCP that uses
a mixed Reno/Vegas approach to calculate the cwnd.
For further details look here:
ftp://ftp.research.microsoft.com/pub/tr/TR-2005-86.pdf
Signed-off-by: Angelo P. Castellani <angelo.castellani@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Bin Zhou [Tue, 6 Jun 2006 00:28:30 +0000 (17:28 -0700)]
[TCP]: TCP Veno congestion control
TCP Veno module is a new congestion control module to improve TCP
performance over wireless networks. The key innovation in TCP Veno is
the enhancement of TCP Reno/Sack congestion control algorithm by using
the estimated state of a connection based on TCP Vegas. This scheme
significantly reduces "blind" reduction of TCP window regardless of
the cause of packet loss.
This work is based on the research paper "TCP Veno: TCP Enhancement
for Transmission over Wireless Access Networks." C. P. Fu, S. C. Liew,
IEEE Journal on Selected Areas in Communication, Feb. 2003.
Original paper and many latest research works on veno:
http://www.ntu.edu.sg/home/ascpfu/veno/veno.html
Signed-off-by: Bin Zhou <zhou0022@ntu.edu.sg>
Cheng Peng Fu <ascpfu@ntu.edu.sg> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
TCP Low Priority is a distributed algorithm whose goal is to utilize only
the excess network bandwidth as compared to the ``fair share`` of
bandwidth as targeted by TCP. Available from:
http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf
Original Author:
Aleksandar Kuzmanovic <akuzma@northwestern.edu>
See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
As of 2.6.13, Linux supports pluggable congestion control algorithms.
Due to the limitation of the API, we take the following changes from
the original TCP-LP implementation:
o We use newReno in most core CA handling. Only add some checking
within cong_avoid.
o Error correcting in remote HZ, therefore remote HZ will be keeped
on checking and updating.
o Handling calculation of One-Way-Delay (OWD) within rtt_sample, sicne
OWD have a similar meaning as RTT. Also correct the buggy formular.
o Handle reaction for Early Congestion Indication (ECI) within
pkts_acked, as mentioned within pseudo code.
o OWD is handled in relative format, where local time stamp will in
tcp_time_stamp format.
Port from 2.4.19 to 2.6.16 as module by:
Wong Hoi Sing Edison <hswong3i@gmail.com>
Hung Hing Lun <hlhung3i@gmail.com>
Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Tue, 30 May 2006 01:27:32 +0000 (18:27 -0700)]
[NETFILTER]: PPTP helper: fixup gre_keymap_lookup() return type
GRE keys are 16-bit wide.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 30 May 2006 01:27:09 +0000 (18:27 -0700)]
[NETFILTER]: Add SIP connection tracking helper
Add SIP connection tracking helper. Originally written by
Christian Hentschel <chentschel@arnet.com.ar>, some cleanup, minor
fixes and bidirectional SIP support added by myself.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 30 May 2006 01:26:47 +0000 (18:26 -0700)]
[NETFILTER]: H.323 helper: replace internal_net_addr parameter by routing-based heuristic
Call Forwarding doesn't need to create an expectation if both peers can
reach each other without our help. The internal_net_addr parameter
lets the user explicitly specify a single network where this is true,
but is not very flexible and even fails in the common case that calls
will both be forwarded to outside parties and inside parties. Use an
optional heuristic based on routing instead, the assumption is that
if bpth the outgoing device and the gateway are equal, both peers can
reach each other directly.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jing Min Zhao [Tue, 30 May 2006 01:26:27 +0000 (18:26 -0700)]
[NETFILTER]: H.323 helper: Add support for Call Forwarding
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 30 May 2006 01:25:58 +0000 (18:25 -0700)]
[NETFILTER]: amanda helper: convert to textsearch infrastructure
When a port number within a packet is replaced by a differently sized
number only the packet is resized, but not the copy of the data.
Following port numbers are rewritten based on their offsets within
the copy, leading to packet corruption.
Convert the amanda helper to the textsearch infrastructure to avoid
the copy entirely.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sun, 28 May 2006 06:06:13 +0000 (23:06 -0700)]
[IPSEC] proto: Move transport mode input path into xfrm_mode_transport
Now that we have xfrm_mode objects we can move the transport mode specific
input decapsulation code into xfrm_mode_transport. This removes duplicate
code as well as unnecessary header movement in case of tunnel mode SAs
since we will discard the original IP header immediately.
This also fixes a minor bug for transport-mode ESP where the IP payload
length is set to the correct value minus the header length (with extension
headers for IPv6).
Of course the other neat thing is that we no longer have to allocate
temporary buffers to hold the IP headers for ESP and IPComp.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sun, 28 May 2006 06:03:58 +0000 (23:03 -0700)]
[IPSEC] xfrm: Undo afinfo lock proliferation
The number of locks used to manage afinfo structures can easily be reduced
down to one each for policy and state respectively. This is based on the
observation that the write locks are only held by module insertion/removal
which are very rare events so there is no need to further differentiate
between the insertion of modules like ipv6 versus esp6.
The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look
suspicious at first. However, after you realise that nobody ever takes
the corresponding write lock you'll feel better :)
As far as I can gather it's an attempt to guard against the removal of
the corresponding modules. Since neither module can be unloaded at all
we can leave it to whoever fixes up IPv6 unloading :)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sat, 27 May 2006 00:48:07 +0000 (17:48 -0700)]
[TG3]: Add recovery logic when MMIOs are re-ordered
Add recovery logic when we suspect that the system is re-ordering
MMIOs. Re-ordered MMIOs to the send mailbox can cause bogus tx
completions and hit BUG_ON() in the tx completion path.
tg3 already has logic to handle re-ordered MMIOs by flushing the MMIOs
that must be strictly ordered (such as the send mailbox). Determining
when to enable the flush is currently a manual process of adding known
chipsets to a list.
The new code replaces the BUG_ON() in the tx completion path with the
call to tg3_tx_recover(). It will set the TG3_FLAG_MBOX_WRITE_REORDER
flag and reset the chip later in the workqueue to recover and start
flushing MMIOs to the mailbox.
A message to report the problem will be printed. We will then decide
whether or not to add the host bridge to the list of chipsets that do
re-ordering.
We may add some additional code later to print the host bridge's ID so
that the user can report it more easily.
The assumption that re-ordering can only happen on x86 systems is also
removed.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Samuel Ortiz [Thu, 25 May 2006 23:19:22 +0000 (16:19 -0700)]
[IRDA]: Initial support for MCS7780 based dongles
The MosChip MCS7780 chipset is an IrDA USB bridge that
doesn't conform with the IrDA-USB standard and thus needs
its separate driver.
Tested on an actual MCS7780 based dongle.
Original implementation by Brian Pugh <bpugh@cs.pdx.edu>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 May 2006 23:11:14 +0000 (16:11 -0700)]
[TCP]: tcp_rcv_rtt_measure_ts() call in pure-ACK path is superfluous
We only want to take receive RTT mesaurements for data
bearing frames, here in the header prediction fast path
for a pure-sender, we know that we have a pure-ACK and
thus the checks in tcp_rcv_rtt_mesaure_ts() will not pass.
Signed-off-by: David S. Miller <davem@davemloft.net>
[LLC]: allow applications to get copy of kernel datagrams
It is legal for an application to bind to a SAP that is also being
used by the kernel. This happens if the bridge module binds to the
STP SAP, and the user wants to have a daemon for STP as well.
It is possible to have kernel doing STP on one bridge, but
let application do RSTP on another bridge.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
LLC receive is broken for SOCK_DGRAM.
If an application does recv() on a datagram socket and there
is no data present, don't return "not connected". Instead, just
do normal datagram semantics.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Thu, 25 May 2006 20:26:53 +0000 (13:26 -0700)]
[I/OAT]: Do not use for_each_cpu().
for_each_cpu() is going away (and is gone in -mm).
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roland Dreier [Sun, 18 Jun 2006 03:44:49 +0000 (20:44 -0700)]
IB/uverbs: Don't serialize with ib_uverbs_idr_mutex
Currently, all userspace verbs operations that call into the kernel
are serialized by ib_uverbs_idr_mutex. This can be a scalability
issue for some workloads, especially for devices driven by the ipath
driver, which needs to call into the kernel even for datapath
operations.
Fix this by adding reference counts to the userspace objects, and then
converting ib_uverbs_idr_mutex into a spinlock that only protects the
idrs long enough to take a reference on the object being looked up.
Because remove operations may fail, we have to do a slightly funky
two-step deletion, which is described in the comments at the top of
uverbs_cmd.c.
This also still leaves ib_uverbs_idr_lock as a single lock that is
possibly subject to contention. However, the lock hold time will only
be a single idr operation, so multiple threads should still be able to
make progress, even if ib_uverbs_idr_lock is being ping-ponged.
Surprisingly, these changes even shrink the object code:
Roland Dreier [Sun, 18 Jun 2006 03:37:41 +0000 (20:37 -0700)]
IB/mthca: Make all device methods truly reentrant
Documentation/infiniband/core_locking.txt says:
All of the methods in struct ib_device exported by a low-level
driver must be fully reentrant. The low-level driver is required to
perform all synchronization necessary to maintain consistency, even
if multiple function calls using the same object are run
simultaneously.
However, mthca's modify_qp, modify_srq and resize_cq methods are
currently not reentrant. Add a mutex to the QP, SRQ and CQ structures
so that these calls can be properly serialized.
Roland Dreier [Sun, 18 Jun 2006 03:37:41 +0000 (20:37 -0700)]
IB/mthca: Fix memory leak on modify_qp error paths
Some error paths after the mthca_alloc_mailbox() call in mthca_modify_qp()
just do a "return -EINVAL" without freeing the mailbox. Convert these
returns to "goto out" to avoid leaking the mailbox storage.
Roland Dreier [Sun, 18 Jun 2006 03:37:40 +0000 (20:37 -0700)]
IB/uverbs: Don't decrement usecnt on error paths
In error paths when destroying an object, uverbs should not decrement
associated objects' usecnt, since ib_dereg_mr(), ib_destroy_qp(),
etc. already do that.
Sean Hefty [Sun, 18 Jun 2006 03:37:39 +0000 (20:37 -0700)]
IB/ucm: Get rid of duplicate P_Key parameter
The P_Key is provided into a SIDR REQ in two places, once as a
parameter, and again in the path record. Remove the P_Key as a
parameter and always use the one given in the path record.
This change has no practical effect on ABI functionality.
Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Ishai Rabinovitz [Sun, 18 Jun 2006 03:37:38 +0000 (20:37 -0700)]
IB/srp: Factor out common request reset code
Misc cleanups in ib_srp:
1) I think that it is more efficient to move the req entries from req_list
to free_list in srp_reconnect_target (rather than rebuild the free_list).
(In any case this code is shorter).
2) This allows us to reuse code in srp_reset_device and srp_reconnect_target
and call a new function srp_reset_req.
Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Ramachandra K [Sun, 18 Jun 2006 03:37:38 +0000 (20:37 -0700)]
IB/srp: Support SRP rev. 10 targets
There has been a change in the format of port identifiers between
revision 10 of the SRP specification and the current revision 16A.
Revision 10 specifies port identifier format as
lower 8 bytes : GUID upper 8 bytes : Extension
Whereas revision 16A specifies it as
lower 8 bytes : Extension upper 8 bytes : GUID
There are older targets (e.g. SilverStorm Virtual Fibre Channel
Bridge) which conform to revision 10 of the SRP specification.
The I/O class of revision 10 is 0xFF00 and the I/O class of revision
16A is 0x0100.
For supporting older targets, this patch:
1) Adds a new optional target creation parameter "io_class". Default
value of io_class is 0x0100 (i.e. revision 16A)
2) Uses the correct port identifier format for targets with IO class
of 0xFF00 (i.e. conforming to revision 10)
Signed-off-by: Ramachandra K <rkuchimanchi@silverstorm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Ramachandra K [Sun, 18 Jun 2006 03:37:38 +0000 (20:37 -0700)]
[SCSI] srp.h: Add I/O Class values
Add enum values for I/O Class values from rev. 10 and rev. 16a SRP
drafts. The values are used to detect targets that implement obsolete
revisions of SRP, so that the initiator can use the old format for
port identifier when connecting to them.
Signed-off-by: Ramachandra K <rkuchimanchi@silverstorm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Or Gerlitz [Sun, 18 Jun 2006 03:37:37 +0000 (20:37 -0700)]
IB/fmr: Use device's max_map_map_per_fmr attribute in FMR pool.
When creating a FMR pool, query the IB device and use the returned
max_map_map_per_fmr attribute as for the max number of FMR remaps. If
the device does not suport querying this attribute, use the original
IB_FMR_MAX_REMAPS (32) default.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Or Gerlitz [Sun, 18 Jun 2006 03:37:37 +0000 (20:37 -0700)]
IB/mthca: Fill in max_map_per_fmr device attribute
Report the true max_map_per_fmr value from mthca_query_device(),
taking into account the change in FMR remapping introduced by the
Sinai performance optimization.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Leonid Arsh [Sun, 18 Jun 2006 03:37:36 +0000 (20:37 -0700)]
IB/mthca: Add client reregister event generation
Change the mthca snoop of MADs that set PortInfo to check if the SM
has set the client reregister bit, and if it has, generate a client
reregister event. If the bit is not set, just generate a LID change
event as usual.
Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Jack Morgenstein [Mon, 29 May 2006 16:14:05 +0000 (19:14 +0300)]
IPoIB: Fix kernel unaligned access on ia64
Fix misaligned access faults on ia64: never cast a misaligned
neighbour->ha + 4 pointer to union ib_gid type; pass a void * pointer
instead. The memcpy was being optimized to use full word accesses
because the compiler thought that union ib_gid is always aligned.
The cast in IPOIB_GID_ARG is safe, since it is fixed to access each
byte separately.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Sun, 18 Jun 2006 03:37:34 +0000 (20:37 -0700)]
IPoIB: Avoid using stale last_send counter when reaping AHs
The comparisons of priv->tx_tail to ah->last_send in ipoib_free_ah()
and ipoib_post_receive() are slightly unsafe, because priv->tx_lock is
not held and hence a stale value of ah->last_send might be used, which
would lead to freeing an AH before the driver was really done with it.
The simple way to fix this is to the optimization of early free from
ipoib_free_ah() and unconditionally queue AHs for reaping, and then
take priv->tx_lock in __ipoib_reap_ah().