IB/mad: Fix possible lock-lock-timer deadlock
author Roland Dreier <rolandd@cisco.com>
Mon, 7 Sep 2009 15:27:50 +0000 (08:27 -0700)
committer Roland Dreier <rolandd@cisco.com>
Mon, 7 Sep 2009 15:27:50 +0000 (08:27 -0700)
commit 6b2eef8fd78ff909c3396b8671d57c42559cc51d
tree 98557140c16bc825a82bfd414fedda46749dbbf7
parent 60f2b652f54aa4ac4127a538abad05235fb9c469

Lockdep reported a possible deadlock with cm_id_priv->lock,
mad_agent_priv->lock and mad_agent_priv->timed_work.timer; this
happens because the mad module does

    cancel_delayed_work(&mad_agent_priv->timed_work);

while holding mad_agent_priv->lock.  cancel_delayed_work() internally
does del_timer_sync(&mad_agent_priv->timed_work.timer).

This can turn into a deadlock because mad_agent_priv->lock is taken
inside cm_id_priv->lock, so we can get the following set of contexts
that deadlock each other:

 A: holding cm_id_priv->lock, waiting for mad_agent_priv->lock
 B: holding mad_agent_priv->lock, waiting for del_timer_sync()
 C: interrupt arriving while the mad_agent_priv->timed_work.timer
    handler is running, waiting for cm_id_priv->lock

Fix this by using the new __cancel_delayed_work() interface (which
internally does del_timer() instead of del_timer_sync()) in all the
places where we are holding a lock.
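
The dependency cycle described above can be checked with a small
user-space model. This is only an illustrative sketch, not kernel code:
the names wait_graph, build_graph() and has_deadlock() are hypothetical,
and the model assumes that each context waits on at most one other
context. It shows that the three contexts form a cycle when the cancel
waits for the timer handler (del_timer_sync() semantics) and that the
cycle disappears when it does not (del_timer() semantics).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Contexts A, B and C from the commit message, used as node indices. */
enum { CTX_A, CTX_B, CTX_C, CTX_COUNT };

/* waits_for[x] is the context that x is blocked on, or -1 if x can run. */
struct wait_graph {
    int waits_for[CTX_COUNT];
};

/*
 * Build the wait-for graph (hypothetical helper).
 * sync_cancel == true models del_timer_sync(): B waits for the timer
 * handler, which cannot finish while C is stuck inside it.
 * sync_cancel == false models del_timer(): B does not wait.
 */
static struct wait_graph build_graph(bool sync_cancel)
{
    struct wait_graph g;

    /* A wants mad_agent_priv->lock, which B holds. */
    g.waits_for[CTX_A] = CTX_B;
    /* B waits for the handler interrupted by C only when the cancel is synchronous. */
    g.waits_for[CTX_B] = sync_cancel ? CTX_C : -1;
    /* C wants cm_id_priv->lock, which A holds. */
    g.waits_for[CTX_C] = CTX_A;
    return g;
}

/* Return true if following wait edges from any context leads back to it. */
static bool has_deadlock(const struct wait_graph *g)
{
    assert(g != NULL);

    for (int start = 0; start < CTX_COUNT; start++) {
        int cur = start;

        /* A cycle through CTX_COUNT nodes needs at most CTX_COUNT steps. */
        for (int steps = 0; steps < CTX_COUNT; steps++) {
            cur = g->waits_for[cur];
            if (cur < 0)
                break;          /* chain ends: this context is not stuck */
            if (cur == start)
                return true;    /* came back to where we started */
        }
    }
    return false;
}
```

Calling has_deadlock() on build_graph(true) reports the A -> B -> C -> A
cycle, while build_graph(false) has no cycle because B no longer waits
on anything while it holds mad_agent_priv->lock.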

Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=13757
Reported-by: Bart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
drivers/infiniband/core/mad.c