[SCSI] zfcp: avoid false ERP complete due to sema race
Swen Schillig [Fri, 17 Apr 2009 13:08:06 +0000 (15:08 +0200)]
The ERP thread performs a task before it executes the corresponding
down on the semaphore. The response handler of the just-started
exchange config is supposed to wait for completion by performing a
down on this same semaphore. Since the semaphore is still positive
from the ERP enqueue, the handler does not wait, so the exchange
config always fails, leaving the adapter in an error state.
The problem is solved by performing the down on the semaphore
before starting an ERP task, which is the logically correct order.
As a result, the ERP loop is only walked if there is a task to perform.
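
The ordering problem described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
driver code: struct model_sem, model_up(), model_try_down() and
handler_completes_falsely() are hypothetical names, and the counter merely
stands in for a kernel semaphore such as adapter->erp_ready_sem. A
non-blocking "try down" is used so the model can report whether a real
down would have returned immediately.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a counting semaphore (not the kernel API). */
struct model_sem {
	int count;
};

static void model_up(struct model_sem *s)
{
	s->count++;
}

/*
 * Non-blocking down: returns true if the count could be taken,
 * false where a real down() would have had to block.
 */
static bool model_try_down(struct model_sem *s)
{
	if (s->count <= 0)
		return false;
	s->count--;
	return true;
}

/*
 * Returns true if the response handler's wait returns although no
 * completion has been signalled yet (the false "ERP complete").
 */
static bool handler_completes_falsely(bool down_before_task)
{
	struct model_sem ready = { 0 };
	bool taken;

	model_up(&ready);	/* ERP enqueue signals the thread */

	if (down_before_task) {
		/* fixed order: the thread consumes the enqueue signal first */
		taken = model_try_down(&ready);
		assert(taken);
		(void)taken;
	}

	/*
	 * The task has been started and its handler now waits on the same
	 * semaphore. Nobody has signalled completion at this point, so the
	 * wait must not succeed.
	 */
	return model_try_down(&ready);
}
```

In this model, handler_completes_falsely(false) corresponds to the old
order (task first, down afterwards) and handler_completes_falsely(true)
to the order introduced by this patch.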

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

drivers/s390/scsi/zfcp_erp.c

index b73e370..fdc9b43 100644
@@ -1311,6 +1311,11 @@ static int zfcp_erp_thread(void *data)
 
        while (!(atomic_read(&adapter->status) &
                 ZFCP_STATUS_ADAPTER_ERP_THREAD_KILL)) {
+
+               zfcp_rec_dbf_event_thread_lock("erthrd1", adapter);
+               ignore = down_interruptible(&adapter->erp_ready_sem);
+               zfcp_rec_dbf_event_thread_lock("erthrd2", adapter);
+
                write_lock_irqsave(&adapter->erp_lock, flags);
                next = adapter->erp_ready_head.next;
                write_unlock_irqrestore(&adapter->erp_lock, flags);
@@ -1322,10 +1327,6 @@ static int zfcp_erp_thread(void *data)
                        if (zfcp_erp_strategy(act) != ZFCP_ERP_DISMISSED)
                                zfcp_erp_wakeup(adapter);
                }
-
-               zfcp_rec_dbf_event_thread_lock("erthrd1", adapter);
-               ignore = down_interruptible(&adapter->erp_ready_sem);
-               zfcp_rec_dbf_event_thread_lock("erthrd2", adapter);
        }
 
        atomic_clear_mask(ZFCP_STATUS_ADAPTER_ERP_THREAD_UP, &adapter->status);