12 years agoRDMA/addr: Keep pointer to netdevice in struct rdma_dev_addr
Or Gerlitz [Tue, 15 Jul 2008 06:48:53 +0000]
RDMA/addr: Keep pointer to netdevice in struct rdma_dev_addr

Keep a pointer to the local (src) netdevice in struct rdma_dev_addr,
and copy it in as part of rdma_copy_addr().  Use rdma_translate_ip()
in cma_new_conn_id() to reduce some code duplication and also make
sure the src_dev member gets set.

In a high-availability configuration the netdevice pointer can be used
by the RDMA CM to align RDMA sessions to use the same links as the IP
stack does under fail-over and route change cases.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA/cxgb3: Fixes for zero STag
Steve Wise [Tue, 15 Jul 2008 06:48:53 +0000]
RDMA/cxgb3: Fixes for zero STag

Handling the zero STag in receive work request requires some extra
logic in the driver:

 - Only set the QP_PRIV bit for kernel mode QPs.

- Add a zero STag build function for recv wrs. The uP needs a PBL
  allocated and passed down in the recv WR so it can construct a HW
  PBL for the zero STag S/G entries.  Note: we need to place a few
  restrictions on zero STag usage because of this:

  1) all SGEs in a recv WR must either be zero STag or not.  No mixing.

  2) an individual SGE length cannot exceed 128MB for a zero-stag SGE.
     This should be OK since it's not really practical to allocate
     such a large chunk of pinned contiguous DMA mapped memory.

- Add an optimized non-zero-STag recv wr format for kernel users.
  This is needed to optimize both zero and non-zero STag cracking in
  the recv path for kernel users.

 - Remove the iwch_ prefix from the static build functions.

 - Bump required FW version.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>

12 years agoRDMA/core: Add local DMA L_Key support
Steve Wise [Tue, 15 Jul 2008 06:48:53 +0000]
RDMA/core: Add local DMA L_Key support

- Change the IB_DEVICE_ZERO_STAG flag to the transport-neutral name
  IB_DEVICE_LOCAL_DMA_LKEY, which is used by iWARP RNICs to indicate 0
  STag support and IB HCAs to indicate reserved L_Key support.

- Add a u32 local_dma_lkey member to struct ib_device.  Drivers fill
  this in with the appropriate local DMA L_Key (if they support it).

- Fix up the drivers using this flag.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/mthca: Fix check of max_send_sge for special QPs
Roland Dreier [Tue, 15 Jul 2008 06:48:52 +0000]
IB/mthca: Fix check of max_send_sge for special QPs

The MLX transport requires two extra gather entries for sends (one for
the header and one for the checksum at the end, as the comment says).
However the code checked that max_recv_sge was not too big, instead of
checking max_send_sge as it should have.  Fix the code to check the
correct condition.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/mthca: Use round_jiffies() for catastrophic error polling timer
Roland Dreier [Tue, 15 Jul 2008 06:48:52 +0000]
IB/mthca: Use round_jiffies() for catastrophic error polling timer

Exactly when the catastrophic error polling timer function runs is not
important, so use round_jiffies() to save unnecessary wakeups.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/mthca: Remove "stop" flag for catastrophic error polling timer
Roland Dreier [Tue, 15 Jul 2008 06:48:52 +0000]
IB/mthca: Remove "stop" flag for catastrophic error polling timer

Since we use del_timer_sync() anyway, there's no need for an
additional flag to tell the timer not to rearm.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB: Double default RX/TX ring sizes
Eli Cohen [Tue, 15 Jul 2008 06:48:52 +0000]
IPoIB: Double default RX/TX ring sizes

Increase IPoIB ring sizes to twice their original sizes (RX: 128->256,
TX: 64->128) to act as a shock absorber for high traffic peaks.  With
the current settings, we have seen cases that there are many calls to
netif_stop_queue(), which causes degradation in throughput.  Also,
larger receive buffer sizes help IPoIB in CM mode to avoid experiencing
RNR NAK conditions due to insufficient receive buffers at the SRQ.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB/cm: Reduce connected mode TX object size
Eli Cohen [Tue, 15 Jul 2008 06:48:52 +0000]
IPoIB/cm: Reduce connected mode TX object size

Since IPoIB connected mode does not NETIF_F_SG, we only have one DMA
mapping per send, so we don't need a mapping[] array.  Define a new
struct with a single u64 mapping member and use it for the CM tx_ring.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/ipath: Use IEEE OUI for vendor_id reported by ibv_query_device()
Ralph Campbell [Tue, 15 Jul 2008 06:48:52 +0000]
IB/ipath: Use IEEE OUI for vendor_id reported by ibv_query_device()

The IB spe. for SubnGet(NodeInfo) and query HCA says that the vendor
ID field should be the IEEE OUI assigned to the vendor.  The ipath
driver was returning the PCI vendor ID instead.  This will affect
applications which call ibv_query_device().  The old value was
0x001fc1 or 0x001077, the new value is 0x001175.

The vendor ID doesn't appear to be exported via /sys so that should
reduce possible compatibility issues.  I'm only aware of Open MPI as a
major application which depends on this change, and they have made
necessary adjustments.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB: Use dev_set_mtu() to change mtu
Eli Cohen [Tue, 15 Jul 2008 06:48:51 +0000]
IPoIB: Use dev_set_mtu() to change mtu

When the driver sets the MTU of the net device outside of its
change_mtu method, it should make use of dev_set_mtu() instead of
directly setting the mtu field of struct netdevice.  Otherwise
functions registered to be called upon MTU change will not get called
(this is done through call_netdevice_notifiers() in dev_set_mtu()).

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB: Use rtnl lock/unlock when changing device flags
Eli Cohen [Tue, 15 Jul 2008 06:48:51 +0000]
IPoIB: Use rtnl lock/unlock when changing device flags

Use of this lock is required to synchronize changes to the netdvice's
data structs.  Also move the call to ipoib_flush_paths() after the
modification of the netdevice flags in set_mode().

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB: Get rid of ipoib_mcast_detach() wrapper
Roland Dreier [Tue, 15 Jul 2008 06:48:50 +0000]
IPoIB: Get rid of ipoib_mcast_detach() wrapper

ipoib_mcast_detach() does nothing except call ib_detach_mcast(), so just
use the core API in the one place that does a multicast group detach.

add/remove: 0/1 grow/shrink: 0/1 up/down: 0/-105 (-105)
function                                     old     new   delta
ipoib_mcast_leave                            357     319     -38
ipoib_mcast_detach                            67       -     -67

Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB: Only set Q_Key once: after joining broadcast group
Eli Cohen [Tue, 15 Jul 2008 06:48:50 +0000]
IPoIB: Only set Q_Key once: after joining broadcast group

The current code will set the Q_Key for any join of a non-sendonly
multicast group.  The operation involves a modify QP operation, which
is fairly heavyweight, and is only really required after the join of
the broadcast group.  Fix this by adding a parameter to ipoib_mcast_attach()
to control when the Q_Key is set.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB: Remove priv->mcast_mutex
Eli Cohen [Tue, 15 Jul 2008 06:48:50 +0000]
IPoIB: Remove priv->mcast_mutex

No need for a mutex around calls to ib_attach_mcast/ib_detach_mcast
since these operations are synchronized at the HW driver layer.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB: Remove unused IPOIB_MCAST_STARTED code
Eli Cohen [Tue, 15 Jul 2008 06:48:50 +0000]
IPoIB: Remove unused IPOIB_MCAST_STARTED code

The IPOIB_MCAST_STARTED flag is not used at all since commit b3e2749b
("IPoIB: Don't drop multicast sends when they can be queued"), so
remove it.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA/cxgb3: Set rkey field for new memory windows in iwch_alloc_mw()
Steve Wise [Tue, 15 Jul 2008 06:48:49 +0000]
RDMA/cxgb3: Set rkey field for new memory windows in iwch_alloc_mw()

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA/nes: Get rid of ring_doorbell parameter of nes_post_cqp_request()
Roland Dreier [Tue, 15 Jul 2008 06:48:49 +0000]
RDMA/nes: Get rid of ring_doorbell parameter of nes_post_cqp_request()

Every caller of nes_post_cqp_request() passed it NES_CQP_REQUEST_RING_DOORBELL,
so just remove that parameter and always ring the doorbell.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Faisal Latif <flatif@neteffect.com>

12 years agoRDMA/cxgb3: Propagate HW page size capabilities
Jon Mason [Tue, 15 Jul 2008 06:48:49 +0000]
RDMA/cxgb3: Propagate HW page size capabilities

cxgb3 does not currently report the page size capabilities, and
incorrectly reports them internally.

This version changes the bit-shifting to a static value (per Steve's
request).

Signed-off-by: Jon Mason <jon@opengridcomputing.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA/nes: Encapsulate logic nes_put_cqp_request()
Roland Dreier [Tue, 15 Jul 2008 06:48:49 +0000]
RDMA/nes: Encapsulate logic nes_put_cqp_request()

The iw_nes driver repeats the logic

if (atomic_dec_and_test(&cqp_request->refcount)) {
if (cqp_request->dynamic) {
kfree(cqp_request);
} else {
spin_lock_irqsave(&nesdev->cqp.lock, flags);
list_add_tail(&cqp_request->list, &nesdev->cqp_avail_reqs);
spin_unlock_irqrestore(&nesdev->cqp.lock, flags);
}
}

over and over.  Wrap this up in functions nes_free_cqp_request() and
nes_put_cqp_request() to simplify such code.

In addition to making the source smaller and more readable, this shrinks
the compiled code quite a bit:

add/remove: 2/0 grow/shrink: 0/13 up/down: 164/-1692 (-1528)
function                                     old     new   delta
nes_free_cqp_request                           -     147    +147
nes_put_cqp_request                            -      17     +17
nes_modify_qp                               2316    2293     -23
nes_hw_modify_qp                             737     657     -80
nes_dereg_mr                                 945     860     -85
flush_wqes                                   501     416     -85
nes_manage_apbvt                             648     560     -88
nes_reg_mr                                  1117    1026     -91
nes_cqp_ce_handler                           927     769    -158
nes_alloc_mw                                1052     884    -168
nes_create_qp                               5314    5141    -173
nes_alloc_fmr                               2212    2035    -177
nes_destroy_cq                              1097     918    -179
nes_create_cq                               2787    2598    -189
nes_dealloc_mw                               762     566    -196

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Faisal Latif <flatif@neteffect.com>

12 years agoIPoIB: Refresh paths instead of flushing them on SM change events
Moni Shoua [Tue, 15 Jul 2008 06:48:49 +0000]
IPoIB: Refresh paths instead of flushing them on SM change events

The patch tries to solve the problem of device going down and paths being
flushed on an SM change event. The method is to mark the paths as candidates for
refresh (by setting the new valid flag to 0), and wait for an ARP
probe a new path record query.

The solution requires a different and less intrusive handling of SM
change event. For that, the second argument of the flush function
changes its meaning from a boolean flag to a level.  In most cases, SM
failover doesn't cause LID change so traffic won't stop.  In the rare
cases of LID change, the remote host (the one that hadn't changed its
LID) will lose connectivity until paths are refreshed. This is no
worse than the current state.  In fact, preventing the device from
going down saves packets that otherwise would be lost.

Signed-off-by: Moni Levy <monil@voltaire.com>
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/ehca: Make device table externally visible
Joachim Fenkes [Tue, 15 Jul 2008 06:48:49 +0000]
IB/ehca: Make device table externally visible

This gives ehca an autogenerated modalias and therefore enables automatic loading.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB: add LRO support
Vladimir Sokolovsky [Tue, 15 Jul 2008 06:48:48 +0000]
IPoIB: add LRO support

Add "ipoib_use_lro" module parameter to enable LRO and an
"ipoib_lro_max_aggr" module parameter to set the max number of packets
to be aggregated.  Make LRO controllable and LRO statistics accessible
through ethtool.

Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB: Use multicast loopback blocking if available
Ron Livne [Tue, 15 Jul 2008 06:48:48 +0000]
IPoIB: Use multicast loopback blocking if available

Set IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK for IPoIB's UD QPs if
supported by the underlying device.  This creates an improvement of up
to 39% in bandwidth when sending multicast packets with IPoIB, and an
improvment of 12% in cpu usage.

Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/mlx4: Add support for blocking multicast loopback packets
Ron Livne [Tue, 15 Jul 2008 06:48:48 +0000]
IB/mlx4: Add support for blocking multicast loopback packets

Add support for handling the IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK
flag by using the per-multicast group loopback blocking feature of
mlx4 hardware.

Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/core: Add support for multicast loopback blocking
Ron Livne [Tue, 15 Jul 2008 06:48:48 +0000]
IB/core: Add support for multicast loopback blocking

This patch also adds a creation flag for QPs,
IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK, which when set means that
multicast sends from the QP to a group that the QP is attached to will
not be looped back to the QP's receive queue.  This can be used to
save receive resources when a consumer does not want a local copy of
multicast traffic; for example IPoIB must waste CPU time throwing away
such local copies of multicast traffic.

This patch also adds a device capability flag that shows whether a
device supports this feature or not.

Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA/cxgb3: Add support for protocol statistics
Steve Wise [Tue, 15 Jul 2008 06:48:48 +0000]
RDMA/cxgb3: Add support for protocol statistics

- Add a new rdma ctl command called RDMA_GET_MIB to the cxgb3 low
  level driver to obtain the protocol mib from the rnic hardware.

- Add new iw_cxgb3 provider method to get the MIB from the low level
  driver.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA/core: Add iWARP protocol statistics attributes in sysfs
Steve Wise [Tue, 15 Jul 2008 06:48:48 +0000]
RDMA/core: Add iWARP protocol statistics attributes in sysfs

This patch adds a sysfs attribute group called "proto_stats" under
/sys/class/infiniband/$device/ and populates this group with protocol
statistics if they exist for a given device.  Currently, only iWARP
stats are defined, but the code is designed to allow InfiniBand
protocol stats if they become available.  These stats are per-device
and more importantly -not- per port.

Details:

- Add union rdma_protocol_stats in ib_verbs.h.  This union allows
  defining transport-specific stats.  Currently only iwarp stats are
  defined.

- Add struct iw_protocol_stats to define the current set of iwarp
  protocol stats.

- Add new ib_device method called get_proto_stats() to return protocol
  statistics.

- Add logic in core/sysfs.c to create iwarp protocol stats attributes
  if the device is an RNIC and has a get_proto_stats() method.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB/cm: Fix racy use of receive WR/SGL in ipoib_cm_post_receive_nonsrq()
Roland Dreier [Tue, 15 Jul 2008 06:48:47 +0000]
IPoIB/cm: Fix racy use of receive WR/SGL in ipoib_cm_post_receive_nonsrq()

For devices that don't support SRQs, ipoib_cm_post_receive_nonsrq() is
called from both ipoib_cm_handle_rx_wc() and ipoib_cm_nonsrq_init_rx(),
and these two callers are not synchronized against each other.
However, ipoib_cm_post_receive_nonsrq() always reuses the same receive
work request and scatter list structures, so multiple callers can end
up stepping on each other, which leads to posting garbled work
requests.

Fix this by having the caller pass in the ib_recv_wr and ib_sge
structures to use, and allocating new local structures in
ipoib_cm_nonsrq_init_rx().

Based on a patch by Pradeep Satyanarayana <pradeep@us.ibm.com> and
David Wilder <dwilder@us.ibm.com>, with debugging help from Hoang-Nam
Nguyen <hnguyen@de.ibm.com>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA/cma: Add missing newlines to printk()s
Roland Dreier [Tue, 15 Jul 2008 06:48:47 +0000]
RDMA/cma: Add missing newlines to printk()s

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>

12 years agoRDMA/cxgb3: Remove write-only iwch_rnic_attributes fields
Roland Dreier [Tue, 15 Jul 2008 06:48:47 +0000]
RDMA/cxgb3: Remove write-only iwch_rnic_attributes fields

The members struct iwch_rnic_attributes.vendor_id and .vendor_part_id
are write-only, so we might as well get rid of them.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>

12 years agoRDMA/cxgb3: Fix up some ib_device_attr fields
Steve Wise [Tue, 15 Jul 2008 06:48:47 +0000]
RDMA/cxgb3: Fix up some ib_device_attr fields

- set fw_ver
- set hw_ver
- set max_qp_wr to something reasonable
- set max_cqe to something reasonable

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/ehca: In case of lost interrupts, trigger EOI to reenable interrupts
Stefan Roscher [Tue, 15 Jul 2008 06:48:47 +0000]
IB/ehca: In case of lost interrupts, trigger EOI to reenable interrupts

During corner case testing, we noticed that some versions of ehca do
not properly transition to interrupt done in special load situations.
This can be resolved by periodically triggering EOI through H_EOI, if
EQEs are pending.

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/ehca: Reject receive work requests if QP is in RESET state
Joachim Fenkes [Tue, 15 Jul 2008 06:48:47 +0000]
IB/ehca: Reject receive work requests if QP is in RESET state

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/mlx4: Remove extra code for RESET->ERR QP state transition
Roland Dreier [Tue, 15 Jul 2008 06:48:46 +0000]
IB/mlx4: Remove extra code for RESET->ERR QP state transition

Commit 65adfa91 ("IB/mlx4: Fix RESET to RESET and RESET to ERROR
transitions") added some extra code to handle a QP state transition
from RESET to ERROR.  However, the latest 1.2.1 version of the IB spec
has clarified that this transition is actually not allowed, so we can
remove this extra code again.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/mthca: Remove extra code for RESET->ERR QP state transition
Roland Dreier [Tue, 15 Jul 2008 06:48:46 +0000]
IB/mthca: Remove extra code for RESET->ERR QP state transition

Commit b18aad71 ("IB/mthca: Fix RESET to ERROR transition") added some
extra code to handle a QP state transition from RESET to ERROR.
However, the latest 1.2.1 version of the IB spec has clarified that
this transition is actually not allowed, so we can remove this extra
code again.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/core: Reset to error QP state transition is not allowed
Ralph Campbell [Tue, 15 Jul 2008 06:48:46 +0000]
IB/core: Reset to error QP state transition is not allowed

I was reviewing the QP state transition diagram in the IB 1.2.1 spec
and the code for qp_state_table[], and noticed that the code allows a
QP to be modified from IB_QPS_RESET to IB_QPS_ERR whereas the notes
for figure 124 (pg 457) specifically says that this transition isn't
allowed.  This is a clarification from earlier versions of the IB
spec, which were ambiguous in this area and suggested that the RESET
to ERR transition was allowed.

Fix up the qp_state_table[] to make RESET->ERR not allowed.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/mlx4: Pass congestion management class MADs to the HCA
Eli Cohen [Tue, 15 Jul 2008 06:48:45 +0000]
IB/mlx4: Pass congestion management class MADs to the HCA

ConnectX HCAs support the IB_MGMT_CLASS_CONG_MGMT management class, so
process MADs of this class through the MAD_IFC firmware command.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/mlx4: Configure QPs' max message size based on real device capability
Eli Cohen [Tue, 15 Jul 2008 06:48:45 +0000]
IB/mlx4: Configure QPs' max message size based on real device capability

ConnectX returns the max message size it supports through the
QUERY_DEV_CAP firmware command.  When modifying a QP to RTR, the max
message size for the QP must be specified.  This value must not exceed
the value declared through QUERY_DEV_CAP.  The current code ignores
the max allowed size and unconditionally sets the value to 2^31.  This
patch sets all QPs to the max value allowed as returned from firmware.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA/cxgb3: MEM_MGT_EXTENSIONS support
Steve Wise [Tue, 15 Jul 2008 06:48:45 +0000]
RDMA/cxgb3: MEM_MGT_EXTENSIONS support

- set IB_DEVICE_MEM_MGT_EXTENSIONS capability bit if fw supports it.
- set max_fast_reg_page_list_len device attribute.
- add iwch_alloc_fast_reg_mr function.
- add iwch_alloc_fastreg_pbl
- add iwch_free_fastreg_pbl
- adjust the WQ depth for kernel mode work queues to account for
  fastreg possibly taking 2 WR slots.
- add fastreg_mr work request support.
- add local_inv work request support.
- add send_with_inv and send_with_se_inv work request support.
- removed useless duplicate enums/defines for TPT/MW/MR stuff.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA/core: Add memory management extensions support
Steve Wise [Tue, 15 Jul 2008 06:48:45 +0000]
RDMA/core: Add memory management extensions support

This patch adds support for the IB "base memory management extension"
(BMME) and the equivalent iWARP operations (which the iWARP verbs
mandates all devices must implement).  The new operations are:

 - Allocate an ib_mr for use in fast register work requests.

 - Allocate/free a physical buffer lists for use in fast register work
   requests.  This allows device drivers to allocate this memory as
   needed for use in posting send requests (eg via dma_alloc_coherent).

 - New send queue work requests:
   * send with remote invalidate
   * fast register memory region
   * local invalidate memory region
   * RDMA read with invalidate local memory region (iWARP only)

Consumer interface details:

 - A new device capability flag IB_DEVICE_MEM_MGT_EXTENSIONS is added
   to indicate device support for these features.

 - New send work request opcodes IB_WR_FAST_REG_MR, IB_WR_LOCAL_INV,
   IB_WR_RDMA_READ_WITH_INV are added.

 - A new consumer API function, ib_alloc_mr() is added to allocate
   fast register memory regions.

 - New consumer API functions, ib_alloc_fast_reg_page_list() and
   ib_free_fast_reg_page_list() are added to allocate and free
   device-specific memory for fast registration page lists.

 - A new consumer API function, ib_update_fast_reg_key(), is added to
   allow the key portion of the R_Key and L_Key of a fast registration
   MR to be updated.  Consumers call this if desired before posting
   a IB_WR_FAST_REG_MR work request.

Consumers can use this as follows:

 - MR is allocated with ib_alloc_mr().

 - Page list memory is allocated with ib_alloc_fast_reg_page_list().

 - MR R_Key/L_Key "key" field is updated with ib_update_fast_reg_key().

 - MR made VALID and bound to a specific page list via
   ib_post_send(IB_WR_FAST_REG_MR)

 - MR made INVALID via ib_post_send(IB_WR_LOCAL_INV),
   ib_post_send(IB_WR_RDMA_READ_WITH_INV) or an incoming send with
   invalidate operation.

 - MR is deallocated with ib_dereg_mr()

 - page lists dealloced via ib_free_fast_reg_page_list().

Applications can allocate a fast register MR once, and then can
repeatedly bind the MR to different physical block lists (PBLs) via
posting work requests to a send queue (SQ).  For each outstanding
MR-to-PBL binding in the SQ pipe, a fast_reg_page_list needs to be
allocated (the fast_reg_page_list is owned by the low-level driver
from the consumer posting a work request until the request completes).
Thus pipelining can be achieved while still allowing device-specific
page_list processing.

The 32-bit fast register memory key/STag is composed of a 24-bit index
and an 8-bit key.  The application can change the key each time it
fast registers thus allowing more control over the peer's use of the
key/STag (ie it can effectively be changed each time the rkey is
rebound to a page list).

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIPoIB: Copy small received SKBs in connected mode
Eli Cohen [Tue, 15 Jul 2008 06:48:44 +0000]
IPoIB: Copy small received SKBs in connected mode

The connected mode implementation in the IPoIB driver has a large
overhead in the way SKBs are handled in the receive flow.  It usually
allocates an SKB with as big as was used in the currently received SKB
and moves unused fragments from the old SKB to the new one. This
involves a loop on all the remaining fragments and incurs overhead on
the CPU.  This patch, for small SKBs, allocates an SKB just large
enough to contain the received data and copies to it the data from the
received SKB.  The newly allocated SKB is passed to the stack and the
old SKB is reposted.

When running netperf, UDP small messages, without this pach I get:

    UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
    14.4.3.178 (14.4.3.178) port 0 AF_INET
    Socket  Message  Elapsed      Messages
    Size    Size     Time         Okay Errors   Throughput
    bytes   bytes    secs            #      #   10^6bits/sec

    114688     128   10.00     5142034      0     526.31
    114688           10.00     1130489            115.71

With this patch I get both send and receive at ~315 mbps.

The reason that send performance actually slows down is as follows:
When using this patch, the overhead of the CPU for handling RX packets
is dramatically reduced.  As a result, we do not experience RNR NAK
messages from the receiver which cause the connection to be closed and
reopened again; when the patch is not used, the receiver cannot handle
the packets fast enough so there is less time to post new buffers and
hence the mentioned RNR NACKs.  So what happens is that the
application *thinks* it posted a certain number of packets for
transmission but these packets are flushed and do not really get
transmitted.  Since the connection gets opened and closed many times,
each time netperf gets the CPU time that otherwise would have been
given to IPoIB to actually transmit the packets.  This can be verified
when looking at the port counters -- the output of ifconfig and the
oputput of netperf (this is for the case without the patch):

    tx packets
    ==========
    port counter:   1,543,996
    ifconfig:       1,581,426
    netperf:        5,142,034

    rx packets
    ==========
    netperf         1,1304,089

Signed-off-by: Eli Cohen <eli@mellanox.co.il>

12 years agoRDMA: Remove subversion $Id tags
Roland Dreier [Tue, 15 Jul 2008 06:48:44 +0000]
RDMA: Remove subversion $Id tags

They don't get updated by git and so they're worse than useless.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA: Improve include file coding style
Dotan Barak [Tue, 15 Jul 2008 06:48:44 +0000]
RDMA: Improve include file coding style

Remove subversion $Id lines and improve readability by fixing other
coding style problems pointed out by checkpatch.pl.

Signed-off-by: Dotan Barak <dotanba@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/ipath: Simplify code using ARRAY_SIZE() macro
Robert P. J. Day [Tue, 15 Jul 2008 06:48:44 +0000]
IB/ipath: Simplify code using ARRAY_SIZE() macro

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/mlx4: Optimize QP stamping
Eli Cohen [Tue, 15 Jul 2008 06:48:44 +0000]
IB/mlx4: Optimize QP stamping

The idea is that for QPs with fixed size work requests (eg selective
signaling QPs), before stamping the WQE, we read the value of the DS
field, which gives the effective size of the descriptor as used in the
previous post.  Then we stamp only that area, since the rest of the
descriptor is already stamped.

When initializing the send queue buffer, make sure the DS field is
initialized to the max descriptor size so that the subsequent stamping
will be done on the entire descriptor area.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/sa: Fail requests made while creating new SM AH
Moni Shoua [Tue, 15 Jul 2008 06:48:43 +0000]
IB/sa: Fail requests made while creating new SM AH

This patch solves a race that occurs after an event occurs that causes
the SA query module to flush its SM address handle (AH).  When SM AH
becomes invalid and needs an update it is handled by the global
workqueue.  On the other hand this event is also handled in the IPoIB
driver by queuing work in the ipoib_workqueue that does multicast
joins.  Although queuing is in the right order, it is done to 2
different workqueues and so there is no guarantee that the first to be
queued is the first to be executed.

This causes a problem because IPoIB may end up sending an request to
the old SM, which will take a long time to time out (since the old SM
is gone); this leads to a much longer than necessary interruption in
multicast traffer.

The patch sets the SA query module's SM AH to NULL when the event
occurs, and until update_sm_ah() is done, any request that needs sm_ah
fails with -EAGAIN return status.

For consumers, the patch doesn't make things worse.  Before the patch,
MADs are sent to the wrong SM so the request gets lost.  Consumers can
be improved if they examine the return code and respond to EAGAIN
properly but even without an improvement the situation is not getting
worse.

Signed-off-by: Moni Levy <monil@voltaire.com>
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA: Fix license text
Sean Hefty [Tue, 15 Jul 2008 06:48:43 +0000]
RDMA: Fix license text

The license text for several files references a third software license
that was inadvertently copied in.  Update the license to what was
intended.  This update was based on a request from HP.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoRDMA/nes: Remove unnecessary memset()
Christophe Jaillet [Tue, 15 Jul 2008 06:48:43 +0000]
RDMA/nes: Remove unnecessary memset()

Remove an explicit memset(..., 0, ...) of a 'listener' structure
allocated with kzalloc().

Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Acked-by: Faisal Latif <faisal@neteffect.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agoIB/srp: Remove use of cached P_Key/GID queries
Roland Dreier [Tue, 15 Jul 2008 06:48:43 +0000]
IB/srp: Remove use of cached P_Key/GID queries

The SRP initiator is currently using ib_find_cached_pkey() and
ib_get_cached_gid() in situations where the uncached ib_find_pkey()
and ib_query_gid() functions serve just as well: sleeping is allowed
and performance is not an issue.  Since we want to eliminate the
cached operations in the long term, convert SRP to use the uncached
variants.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

12 years agofirmware: Correct dependency on CONFIG_EXTRA_FIRMWARE_DIR
David Woodhouse [Tue, 15 Jul 2008 00:50:24 +0000]
firmware: Correct dependency on CONFIG_EXTRA_FIRMWARE_DIR

When CONFIG_EXTRA_FIRMWARE_DIR gets changed, the filename in the .S file
(which uses .incbin to include the binary) needs to change. When we
renamed the BUILTIN_FIRMWARE_DIR option to EXTRA_FIRMWARE_DIR, we forgot
to update the manual dependency in firmware/Makefile, so it was
depending on a non-existent file in include/config/

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12 years agoMerge branch 'for-2.6.27' of git://git.infradead.org/users/dwmw2/firmware-2.6
Linus Torvalds [Mon, 14 Jul 2008 23:54:07 +0000]
Merge branch 'for-2.6.27' of git://git.infradead.org/users/dwmw2/firmware-2.6

* 'for-2.6.27' of git://git.infradead.org/users/dwmw2/firmware-2.6: (64 commits)
  firmware: convert sb16_csp driver to use firmware loader exclusively
  dsp56k: use request_firmware
  edgeport-ti: use request_firmware()
  edgeport: use request_firmware()
  vicam: use request_firmware()
  dabusb: use request_firmware()
  cpia2: use request_firmware()
  ip2: use request_firmware()
  firmware: convert Ambassador ATM driver to request_firmware()
  whiteheat: use request_firmware()
  ti_usb_3410_5052: use request_firmware()
  emi62: use request_firmware()
  emi26: use request_firmware()
  keyspan_pda: use request_firmware()
  keyspan: use request_firmware()
  ttusb-budget: use request_firmware()
  kaweth: use request_firmware()
  smctr: use request_firmware()
  firmware: convert ymfpci driver to use firmware loader exclusively
  firmware: convert maestro3 driver to use firmware loader exclusively
  ...

Fix up trivial conflicts with BKL removal in drivers/char/dsp56k.c and
drivers/char/ip2/ip2main.c manually.

12 years agoMerge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Mon, 14 Jul 2008 23:06:58 +0000]
Merge branch 'for-linus' of /home/rmk/linux-2.6-arm

* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (241 commits)
  [ARM] 5171/1: ep93xx: fix compilation of modules using clocks
  [ARM] 5133/2: at91sam9g20 defconfig file
  [ARM] 5130/4: Support for the at91sam9g20
  [ARM] 5160/1: IOP3XX: gpio/gpiolib support
  [ARM] at91: Fix NAND FLASH timings for at91sam9x evaluation kits.
  [ARM] 5084/1: zylonite: Register AC97 device
  [ARM] 5085/2: PXA: Move AC97 over to the new central device declaration model
  [ARM] 5120/1: pxa: correct platform driver names for PXA25x and PXA27x UDC drivers
  [ARM] 5147/1: pxaficp_ir: drop pxa_gpio_mode calls, as pin setting
  [ARM] 5145/1: PXA2xx: provide api to control IrDA pins state
  [ARM] 5144/1: pxaficp_ir: cleanup includes
  [ARM] pxa: remove pxa_set_cken()
  [ARM] pxa: allow clk aliases
  [ARM] Feroceon: don't disable BPU on boot
  [ARM] Orion: LED support for HP mv2120
  [ARM] Orion: add RD88F5181L-FXO support
  [ARM] Orion: add RD88F5181L-GE support
  [ARM] Orion: add Netgear WNR854T support
  [ARM] s3c2410_defconfig: update for current build
  [ARM] Acer n30: Minor style and indentation fixes.
  ...

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
David Woodhouse [Mon, 14 Jul 2008 22:49:04 +0000]
Merge git://git./linux/kernel/git/torvalds/linux-2.6.git

Conflicts:

sound/pci/Kconfig

12 years ago[ARM] Merge most of the PXA work for initial merge
Russell King [Mon, 14 Jul 2008 20:28:25 +0000]
[ARM] Merge most of the PXA work for initial merge

This includes PXA work up to the SPI changes for the initial merge,
since e172274ccc55d20536fbdceb6131f38e288541e0 depends on the SPI
tree being merged.

Conflicts:

arch/arm/configs/em_x270_defconfig
arch/arm/configs/xm_x270_defconfig

12 years agoMerge branch 'core/softirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip...
Linus Torvalds [Mon, 14 Jul 2008 22:28:42 +0000]
Merge branch 'core/softirq' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core/softirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  softirq: remove irqs_disabled warning from local_bh_enable
  softirq: remove initialization of static per-cpu variable
  Remove argument from open_softirq which is always NULL

12 years agoMerge branch 'core/rodata' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux...
Linus Torvalds [Mon, 14 Jul 2008 22:28:10 +0000]
Merge branch 'core/rodata' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core/rodata' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  move BUG_TABLE into RODATA

12 years agoMerge branch 'core/printk' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux...
Linus Torvalds [Mon, 14 Jul 2008 22:27:43 +0000]
Merge branch 'core/printk' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core/printk' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, generic: mark early_printk as asmlinkage
  printk: export console_drivers
  printk: remember the message level for multi-line output
  printk: refactor processing of line severity tokens
  printk: don't prefer unsuited consoles on registration
  printk: clean up recursion check related static variables
  namespacecheck: more kernel/printk.c fixes
  namespacecheck: fix kernel printk.c

12 years agox86: MMIOTRACE should not default to on
Linus Torvalds [Mon, 14 Jul 2008 22:03:25 +0000]
x86: MMIOTRACE should not default to on

Even the help-text makes it clear that normal people shouldn't enable
it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12 years agoMerge branch 'core/locking' of git://git.kernel.org/pub/scm/linux/kernel/git/tip...
Linus Torvalds [Mon, 14 Jul 2008 21:55:13 +0000]
Merge branch 'core/locking' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core/locking' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: fix kernel/fork.c warning
  lockdep: fix ftrace irq tracing false positive
  lockdep: remove duplicate definition of STATIC_LOCKDEP_MAP_INIT
  lockdep: add lock_class information to lock_chain and output it
  lockdep: add lock_class information to lock_chain and output it
  lockdep: output lock_class key instead of address for forward dependency output
  __mutex_lock_common: use signal_pending_state()
  mutex-debug: check mutex magic before owner

Fixed up conflict in kernel/fork.c manually

12 years agoMerge branch 'sched/new-API-sched_setscheduler' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Mon, 14 Jul 2008 21:50:49 +0000]
Merge branch 'sched/new-API-sched_setscheduler' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'sched/new-API-sched_setscheduler' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: add new API sched_setscheduler_nocheck: add a flag to control access checks

12 years agoMerge branch 'tracing/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 14 Jul 2008 21:49:54 +0000]
Merge branch 'tracing/for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'tracing/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (228 commits)
  ftrace: build fix for ftraced_suspend
  ftrace: separate out the function enabled variable
  ftrace: add ftrace_kill_atomic
  ftrace: use current CPU for function startup
  ftrace: start wakeup tracing after setting function tracer
  ftrace: check proper config for preempt type
  ftrace: trace schedule
  ftrace: define function trace nop
  ftrace: move sched_switch enable after markers
  ftrace: prevent ftrace modifications while being kprobe'd, v2
  fix "ftrace: store mcount address in rec->ip"
  mmiotrace broken in linux-next (8-bit writes only)
  ftrace: avoid modifying kprobe'd records
  ftrace: freeze kprobe'd records
  kprobes: enable clean usage of get_kprobe
  ftrace: store mcount address in rec->ip
  ftrace: build fix with gcc 4.3
  namespacecheck: fixes
  ftrace: fix "notrace" filtering priority
  ftrace: fix printout
  ...

12 years agoMerge branch 'bkl-removal' of git://git.lwn.net/linux-2.6
Linus Torvalds [Mon, 14 Jul 2008 21:48:31 +0000]
Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6

* 'bkl-removal' of git://git.lwn.net/linux-2.6: (146 commits)
  IB/umad: BKL is not needed for ib_umad_open()
  IB/uverbs: BKL is not needed for ib_uverbs_open()
  bf561-coreb: BKL unneeded for open()
  Call fasync() functions without the BKL
  snd/PCM: fasync BKL pushdown
  ipmi: fasync BKL pushdown
  ecryptfs: fasync BKL pushdown
  Bluetooth VHCI: fasync BKL pushdown
  tty_io: fasync BKL pushdown
  tun: fasync BKL pushdown
  i2o: fasync BKL pushdown
  mpt: fasync BKL pushdown
  Remove BKL from remote_llseek v2
  Make FAT users happier by not deadlocking
  x86-mce: BKL pushdown
  vmwatchdog: BKL pushdown
  vmcp: BKL pushdown
  via-pmu: BKL pushdown
  uml-random: BKL pushdown
  uml-mmapper: BKL pushdown
  ...

12 years agofirmware: convert sb16_csp driver to use firmware loader exclusively
Jaswinder Singh [Sat, 5 Jul 2008 12:35:22 +0000]
firmware: convert sb16_csp driver to use firmware loader exclusively

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

12 years agodsp56k: use request_firmware
Jaswinder Singh [Sat, 5 Jul 2008 09:58:30 +0000]
dsp56k: use request_firmware

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

12 years agoedgeport-ti: use request_firmware()
Jaswinder Singh [Fri, 4 Jul 2008 17:36:09 +0000]
edgeport-ti: use request_firmware()

Firmware blob looks like this...
        uint8_t  MajorVersion
        uint8_t  MinorVersion
        __le16   BuildNumber
        uint8_t  data[]

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

12 years agoedgeport: use request_firmware()
Jaswinder Singh [Thu, 3 Jul 2008 11:30:23 +0000]
edgeport: use request_firmware()

Version number provided in first HEX record.

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

12 years agovicam: use request_firmware()
Jaswinder Singh [Fri, 27 Jun 2008 14:20:40 +0000]
vicam: use request_firmware()

Although it wasn't actually using ihex records before, we use the Intel
HEX record format for this firmware -- because that gives us a simple
way to split it into separate chunks internally as we need, without
loading each part as a separate file.

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

12 years agodabusb: use request_firmware()
David Woodhouse [Mon, 23 Jun 2008 10:41:04 +0000]
dabusb: use request_firmware()

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

12 years agocpia2: use request_firmware()
David Woodhouse [Mon, 23 Jun 2008 10:36:23 +0000]
cpia2: use request_firmware()

Thanks for Jaswinder Singh for converting the firmware blob itself to ihex.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

12 years agoMerge commit 'v2.6.26' into bkl-removal
Jonathan Corbet [Mon, 14 Jul 2008 21:29:34 +0000]
Merge commit 'v2.6.26' into bkl-removal

12 years agoftrace: document updates
Steven Rostedt [Mon, 14 Jul 2008 20:41:12 +0000]
ftrace: document updates

The following updates were recommended by Elias Oltmanns and Randy Dunlap.

[ updates based on Andrew Morton's comments are still to come. ]

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12 years agoMerge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip...
Linus Torvalds [Mon, 14 Jul 2008 20:54:49 +0000]
Merge branch 'sched/for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (76 commits)
  sched_clock: and multiplier for TSC to gtod drift
  sched_clock: record TSC after gtod
  sched_clock: only update deltas with local reads.
  sched_clock: fix calculation of other CPU
  sched_clock: stop maximum check on NO HZ
  sched_clock: widen the max and min time
  sched_clock: record from last tick
  sched: fix accounting in task delay accounting & migration
  sched: add avg-overlap support to RT tasks
  sched: terminate newidle balancing once at least one task has moved over
  sched: fix warning
  sched: build fix
  sched: sched_clock_cpu() based cpu_clock(), lockdep fix
  sched: export cpu_clock
  sched: make sched_{rt,fair}.c ifdefs more readable
  sched: bias effective_load() error towards failing wake_affine().
  sched: incremental effective_load()
  sched: correct wakeup weight calculations
  sched: fix mult overflow
  sched: update shares on wakeup
  ...

12 years agoMerge branch 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip...
Linus Torvalds [Mon, 14 Jul 2008 20:43:24 +0000]
Merge branch 'x86/for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (821 commits)
  x86: make 64bit hpet_set_mapping to use ioremap too, v2
  x86: get x86_phys_bits early
  x86: max_low_pfn_mapped fix #4
  x86: change _node_to_cpumask_ptr to return const ptr
  x86: I/O APIC: remove an IRQ2-mask hack
  x86: fix numaq_tsc_disable calling
  x86, e820: remove end_user_pfn
  x86: max_low_pfn_mapped fix, #3
  x86: max_low_pfn_mapped fix, #2
  x86: max_low_pfn_mapped fix, #1
  x86_64: fix delayed signals
  x86: remove conflicting nx6325 and nx6125 quirks
  x86: Recover timer_ack lost in the merge of the NMI watchdog
  x86: I/O APIC: Never configure IRQ2
  x86: L-APIC: Always fully configure IRQ0
  x86: L-APIC: Set IRQ0 as edge-triggered
  x86: merge dwarf2 headers
  x86: use AS_CFI instead of UNWIND_INFO
  x86: use ignore macro instead of hash comment
  x86: use matching CFI_ENDPROC
  ...

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Mon, 14 Jul 2008 20:40:42 +0000]
Merge git://git./linux/kernel/git/herbert/crypto-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (50 commits)
  crypto: ixp4xx - Select CRYPTO_AUTHENC
  crypto: s390 - Respect STFL bit
  crypto: talitos - Add support for sha256 and md5 variants
  crypto: hash - Move ahash functions into crypto/hash.h
  crypto: crc32c - Add ahash implementation
  crypto: hash - Added scatter list walking helper
  crypto: prng - Deterministic CPRNG
  crypto: hash - Removed vestigial ahash fields
  crypto: hash - Fixed digest size check
  crypto: rmd - sparse annotations
  crypto: rmd128 - sparse annotations
  crypto: camellia - Use kernel-provided bitops, unaligned access helpers
  crypto: talitos - Use proper form for algorithm driver names
  crypto: talitos - Add support for 3des
  crypto: padlock - Make module loading quieter when hardware isn't available
  crypto: tcrpyt - Remove unnecessary kmap/kunmap calls
  crypto: ixp4xx - Hardware crypto support for IXP4xx CPUs
  crypto: talitos - Freescale integrated security engine (SEC) driver
  [CRYPTO] tcrypt: Add self test for des3_ebe cipher operating in cbc mode
  [CRYPTO] rmd: Use pointer form of endian swapping operations
  ...

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6
Linus Torvalds [Mon, 14 Jul 2008 20:37:29 +0000]
Merge git://git./linux/kernel/git/hskinnemoen/avr32-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6: (31 commits)
  avr32: Fix typo of IFSR in a comment in the PIO header file
  avr32: Power Management support ("standby" and "mem" modes)
  avr32: Add system device for the internal interrupt controller (intc)
  avr32: Add simple SRAM allocator
  avr32: Enable SDRAMC clock at startup
  rtc-at32ap700x: Enable wakeup
  macb: Basic suspend/resume support
  atmel_serial: Drain console TX shifter before suspending
  atmel_serial: Fix build on avr32 with CONFIG_PM enabled
  avr32: Use a quicklist for PTE allocation as well
  avr32: Use a quicklist for PGD allocation
  avr32: Cover the kernel page tables in the user PGDs
  avr32: Store virtual addresses in the PGD
  avr32: Remove useless zeroing of swapper_pg_dir at startup
  avr32: Clean up and optimize the TLB operations
  avr32: Rename at32ap.c -> pdc.c
  avr32: Move setup_platform() into chip-specific file
  avr32: Kill special exception handler sections
  avr32: Kill unneeded #include <asm/pgalloc.h> from asm/mmu_context.h
  avr32: Clean up time.c #includes
  ...

12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Mon, 14 Jul 2008 20:36:55 +0000]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (25 commits)
  security: remove register_security hook
  security: remove dummy module fix
  security: remove dummy module
  security: remove unused sb_get_mnt_opts hook
  LSM/SELinux: show LSM mount options in /proc/mounts
  SELinux: allow fstype unknown to policy to use xattrs if present
  security: fix return of void-valued expressions
  SELinux: use do_each_thread as a proper do/while block
  SELinux: remove unused and shadowed addrlen variable
  SELinux: more user friendly unknown handling printk
  selinux: change handling of invalid classes (Was: Re: 2.6.26-rc5-mm1 selinux whine)
  SELinux: drop load_mutex in security_load_policy
  SELinux: fix off by 1 reference of class_to_string in context_struct_compute_av
  SELinux: open code sidtab lock
  SELinux: open code load_mutex
  SELinux: open code policy_rwlock
  selinux: fix endianness bug in network node address handling
  selinux: simplify ioctl checking
  SELinux: enable processes with mac_admin to get the raw inode contexts
  Security: split proc ptrace checking into read vs. attach
  ...

12 years agoMerge branch 'drm-reorg' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied...
Linus Torvalds [Mon, 14 Jul 2008 20:32:24 +0000]
Merge branch 'drm-reorg' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-reorg' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm: reorganise drm tree to be more future proof.

12 years agoMerge branch 'for-linus' of git://git.alsa-project.org/alsa-kernel
Linus Torvalds [Mon, 14 Jul 2008 20:26:07 +0000]
Merge branch 'for-linus' of git://git.alsa-project.org/alsa-kernel

* 'for-linus' of git://git.alsa-project.org/alsa-kernel: (179 commits)
  ALSA: Release v1.0.17
  ALSA: correct kcalloc usage
  ALSA: ALSA driver for SGI O2 audio board
  ALSA: asoc: kbuild - only show menus for the current ASoC CPU platform.
  ALSA: ALSA driver for SGI HAL2 audio device
  ALSA: hda - Fix FSC V5505 model
  ALSA: hda - Fix missing init for unsol events on micsense model
  ALSA: hda - Fix internal mic vref pin setup
  ALSA: hda: 92hd71bxx PC Beep
  ALSA: HDA - HP dc7600 with pci sub IDs 0x103c/0x3011 belongs to hp-3013 model
  ALSA: usb-audio: add some Yamaha USB MIDI quirks
  ALSA: usb-audio: fix Yamaha KX quirk
  ALSA: ASoC: Au12x0/Au1550 PSC Audio support
  ALSA: Add Yamaha KX49 (USB MIDI controller) to usbquirks.h
  ALSA: ASoC: pxa2xx-ac97: fix warning due to missing argument in fuction declaration
  ALSA: tosa: fix compilation with new DAPM API
  ALSA: wavefront - add const
  ALSA: remove CONFIG_KMOD from sound
  ALSA: Fix a const to non-const assignment in the Digigram VXpocket sound driver
  ALSA: Fix a const pointer usage warning in the Digigram VX soundcard driver
  ...

12 years agoMerge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Mon, 14 Jul 2008 20:25:01 +0000]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (71 commits)
  [S390] sclp_tty: Fix scheduling while atomic bug.
  [S390] sclp_tty: remove ioctl interface.
  [S390] Remove P390 support.
  [S390] Cleanup vmcp printk messages.
  [S390] Cleanup lcs printk messages.
  [S390] Cleanup kprobes printk messages.
  [S390] Cleanup vmwatch printk messages.
  [S390] Cleanup dcssblk printk messages.
  [S390] Cleanup zfcp dumper printk messages.
  [S390] Cleanup vmlogrdr printk messages.
  [S390] Cleanup s390 debug feature print messages.
  [S390] Cleanup monreader printk messages.
  [S390] Cleanup appldata printk messages.
  [S390] Cleanup smsgiucv printk messages.
  [S390] Cleanup cpacf printk messages.
  [S390] Cleanup qeth print messages.
  [S390] Cleanup netiucv printk messages.
  [S390] Cleanup iucv printk messages.
  [S390] Cleanup sclp printk messages.
  [S390] Cleanup zcrypt printk messages.
  ...

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6
Linus Torvalds [Mon, 14 Jul 2008 20:24:39 +0000]
Merge git://git./linux/kernel/git/brodo/pcmcia-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6: (23 commits)
  pcmcia: Fix ide-cs sparse warning
  pcmcia: ide-cs debugging bugfix
  pcmcia: allow for longer CIS firmware files
  pcmcia: cm40x0 cdev lock_kernel() pushdown
  pcmcia: (re)move {pcmcia,pccard}_get_status
  pcmcia: kill IN_CARD_SERVICES
  pcmcia: Remove unused header file code
  pcmcia: remove unused bulkmem.h
  pcmcia: simplify pccard_validate_cis
  pcmcia: carve out ioctl adjust function to pcmcia_ioctl
  pcmcia: irq probe can be done without risking an IRQ storm
  pcmcia: Fix ti12xx_2nd_slot_empty always failing
  pcmcia: check for pointer instead of pointer address
  pcmcia: switch cm4000_cs.c to unlocked_ioctl
  pcmcia: simplify rsrc_nonstatic attributes
  pcmcia: add support CompactFlash PCMCIA support for Blackfin.
  pcmcia: remove version.h
  pcmcia: cs: kill thread_wait
  pcmcia: i82365.c: check request_irq return value
  pcmcia: fix Alchemy warnings
  ...

12 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
Linus Torvalds [Mon, 14 Jul 2008 20:15:14 +0000]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (37 commits)
  splice: fix generic_file_splice_read() race with page invalidation
  ramfs: enable splice write
  drivers/block/pktcdvd.c: avoid useless memset
  cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack)
  scsi: sr avoids useless buffer allocation
  block: blk_rq_map_kern uses the bounce buffers for stack buffers
  block: add blk_queue_update_dma_pad
  DAC960: push down BKL
  pktcdvd: push BKL down into driver
  paride: push ioctl down into driver
  block: use get_unaligned_* helpers
  block: extend queue_flag bitops
  block: request_module(): use format string
  Add bvec_merge_data to handle stacked devices and ->merge_bvec()
  block: integrity flags can't use bit ops on unsigned short
  cmdfilter: extend default read filter
  sg: fix odd style (extra parenthesis) introduced by cmd filter patch
  block: add bounce support to blk_rq_map_user_iov
  cfq-iosched: get rid of enable_idle being unused warning
  allow userspace to modify scsi command filter on per device basis
  ...

12 years agoStart using the new '%pS' infrastructure to print symbols
Linus Torvalds [Mon, 14 Jul 2008 19:12:53 +0000]
Start using the new '%pS' infrastructure to print symbols

This simplifies the code significantly, and was the whole point of the
exercise.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12 years agoMerge branch 'auto-ftrace-next' into tracing/for-linus
Ingo Molnar [Mon, 14 Jul 2008 14:11:52 +0000]
Merge branch 'auto-ftrace-next' into tracing/for-linus

Conflicts:

arch/x86/kernel/entry_32.S
arch/x86/kernel/process_32.c
arch/x86/kernel/process_64.c
arch/x86/lib/Makefile
include/asm-x86/irqflags.h
kernel/Makefile
kernel/sched.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>

12 years agoMerge branch 'sched/for-linus' into tracing/for-linus
Ingo Molnar [Mon, 14 Jul 2008 14:11:02 +0000]
Merge branch 'sched/for-linus' into tracing/for-linus

12 years agoMerge branch 'tracing/ftrace' into auto-ftrace-next
Ingo Molnar [Mon, 14 Jul 2008 13:58:35 +0000]
Merge branch 'tracing/ftrace' into auto-ftrace-next

12 years agoMerge commit 'v2.6.26' into sched/devel
Ingo Molnar [Mon, 14 Jul 2008 10:19:19 +0000]
Merge commit 'v2.6.26' into sched/devel

12 years agoMerge branch 'sched/clock' into sched/devel
Ingo Molnar [Mon, 14 Jul 2008 10:19:13 +0000]
Merge branch 'sched/clock' into sched/devel

12 years agolockdep: fix kernel/fork.c warning
Ingo Molnar [Mon, 14 Jul 2008 10:09:28 +0000]
lockdep: fix kernel/fork.c warning

fix:

[    0.184011] ------------[ cut here ]------------
[    0.188011] WARNING: at kernel/fork.c:918 copy_process+0x1c0/0x1084()
[    0.192011] Pid: 0, comm: swapper Not tainted 2.6.26-tip-00351-g01d4a50-dirty #14521
[    0.196011]  [<c0135d48>] warn_on_slowpath+0x3c/0x60
[    0.200012]  [<c016f805>] ? __alloc_pages_internal+0x92/0x36b
[    0.208012]  [<c033de5e>] ? __spin_lock_init+0x24/0x4a
[    0.212012]  [<c01347e3>] copy_process+0x1c0/0x1084
[    0.216013]  [<c013575f>] do_fork+0xb8/0x1ad
[    0.220013]  [<c034f75e>] ? acpi_os_release_lock+0x8/0xa
[    0.228013]  [<c034ff7a>] ? acpi_os_vprintf+0x20/0x24
[    0.232014]  [<c01129ee>] kernel_thread+0x75/0x7d
[    0.236014]  [<c0a491eb>] ? kernel_init+0x0/0x24a
[    0.240014]  [<c0a491eb>] ? kernel_init+0x0/0x24a
[    0.244014]  [<c01151b0>] ? kernel_thread_helper+0x0/0x10
[    0.252015]  [<c06c6ac0>] rest_init+0x14/0x50
[    0.256015]  [<c0a498ce>] start_kernel+0x2b9/0x2c0
[    0.260015]  [<c0a4904f>] __init_begin+0x4f/0x57
[    0.264016]  =======================
[    0.268016] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.272016] enabled ExtINT on CPU#0

which occurs if CONFIG_TRACE_IRQFLAGS=y, CONFIG_DEBUG_LOCKDEP=y,
but CONFIG_PROVE_LOCKING is disabled.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

12 years agoMerge commit 'v2.6.26' into x86/core
Ingo Molnar [Mon, 14 Jul 2008 09:37:46 +0000]
Merge commit 'v2.6.26' into x86/core

12 years agolockdep: fix ftrace irq tracing false positive
Ingo Molnar [Mon, 14 Jul 2008 08:28:38 +0000]
lockdep: fix ftrace irq tracing false positive

fix this false positive:

[    0.020000] ------------[ cut here ]------------
[    0.020000] WARNING: at kernel/lockdep.c:2718 check_flags+0x14a/0x170()
[    0.020000] Modules linked in:
[    0.020000] Pid: 0, comm: swapper Not tainted 2.6.26-tip-00343-gd7e5521-dirty #14486
[    0.020000]  [<c01312e4>] warn_on_slowpath+0x54/0x80
[    0.020000]  [<c067e451>] ? _spin_unlock_irqrestore+0x61/0x70
[    0.020000]  [<c0131bb1>] ? release_console_sem+0x201/0x210
[    0.020000]  [<c0143d65>] ? __kernel_text_address+0x35/0x40
[    0.020000]  [<c010562e>] ? dump_trace+0x5e/0x140
[    0.020000]  [<c01518b5>] ? __lock_acquire+0x245/0x820
[    0.020000]  [<c015063a>] check_flags+0x14a/0x170
[    0.020000]  [<c0151ed8>] ? lock_acquire+0x48/0xc0
[    0.020000]  [<c0151ee1>] lock_acquire+0x51/0xc0
[    0.020000]  [<c014a16c>] ? down+0x2c/0x40
[    0.020000]  [<c010a609>] ? sched_clock+0x9/0x10
[    0.020000]  [<c067e7b2>] _write_lock+0x32/0x60
[    0.020000]  [<c013797f>] ? request_resource+0x1f/0xb0
[    0.020000]  [<c013797f>] request_resource+0x1f/0xb0
[    0.020000]  [<c02f89ad>] vgacon_startup+0x2bd/0x3e0
[    0.020000]  [<c094d62a>] con_init+0x19/0x22f
[    0.020000]  [<c0330c7c>] ? tty_register_ldisc+0x5c/0x70
[    0.020000]  [<c094cf49>] console_init+0x20/0x2e
[    0.020000]  [<c092a969>] start_kernel+0x20c/0x379
[    0.020000]  [<c092a516>] ? unknown_bootoption+0x0/0x1f6
[    0.020000]  [<c092a099>] __init_begin+0x99/0xa1
[    0.020000]  =======================
[    0.020000] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.020000] possible reason: unannotated irqs-on.
[    0.020000] irq event stamp: 0

which occurs if CONFIG_TRACE_IRQFLAGS=y, CONFIG_DEBUG_LOCKDEP=y,
but CONFIG_PROVE_LOCKING is disabled.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

12 years agoMerge commit 'v2.6.26' into core/locking
Ingo Molnar [Mon, 14 Jul 2008 08:31:59 +0000]
Merge commit 'v2.6.26' into core/locking

12 years ago[S390] sclp_tty: Fix scheduling while atomic bug.
Heiko Carstens [Mon, 14 Jul 2008 07:59:46 +0000]
[S390] sclp_tty: Fix scheduling while atomic bug.

Finally fixes a possible scheduling while in atomic context bug. The driver
used to wait on a waitqueue if no empty buffer was available. This could
lead to a deadlock if the driver was called from non-schedulable context.
So fix this. The write operation may fail now. It returns the number of
characters accepted. put_char will never fail, since it writes characters
to an intermediate buffer which gets flushed as soon as it is full.
That means the driver now can busy wait if something is in the intermediate
buffer and a write_string operation follows. Seems to be an acceptable
compromise, since that shouldn't happen too often.

Cc: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>

12 years ago[S390] sclp_tty: remove ioctl interface.
Heiko Carstens [Mon, 14 Jul 2008 07:59:45 +0000]
[S390] sclp_tty: remove ioctl interface.

After all we came to the conclusion that this interface doesn't make any
sense. Besides that the ioctl number used was never registered, the header
file isn't exported, and we doubt there is even a single user.
So remove this interface, since it eases maintenance.

Cc: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>

12 years ago[S390] Remove P390 support.
Heiko Carstens [Mon, 14 Jul 2008 07:59:44 +0000]
[S390] Remove P390 support.

Most likely it is broken anyway because of the changes in memory
detection. Since we can't test it and there are probably better ways
that using a P390 card, remove support for it.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

12 years ago[S390] Cleanup vmcp printk messages.
Christian Borntraeger [Mon, 14 Jul 2008 07:59:43 +0000]
[S390] Cleanup vmcp printk messages.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

12 years ago[S390] Cleanup lcs printk messages.
Klaus-D. Wacker [Mon, 14 Jul 2008 07:59:42 +0000]
[S390] Cleanup lcs printk messages.

Cc: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Klaus-D. Wacker <kdwacker@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

12 years ago[S390] Cleanup kprobes printk messages.
Martin Schwidefsky [Mon, 14 Jul 2008 07:59:41 +0000]
[S390] Cleanup kprobes printk messages.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

12 years ago[S390] Cleanup vmwatch printk messages.
Martin Schwidefsky [Mon, 14 Jul 2008 07:59:40 +0000]
[S390] Cleanup vmwatch printk messages.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

12 years ago[S390] Cleanup dcssblk printk messages.
Hongjie Yang [Mon, 14 Jul 2008 07:59:39 +0000]
[S390] Cleanup dcssblk printk messages.

Signed-off-by: Hongjie Yang <hongjie@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

12 years ago[S390] Cleanup zfcp dumper printk messages.
Michael Holzheu [Mon, 14 Jul 2008 07:59:38 +0000]
[S390] Cleanup zfcp dumper printk messages.

Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>