PCI: fix kernel oops on bridge removal
Kenji Kaneshige [Thu, 26 Mar 2009 07:49:52 +0000 (16:49 +0900)]
Fix the following kernel oops problem that happens when removing PCI
bridge with pciehp loaded. It should also occur with other hotplug
driver that is implemented as a bridge's driver.

[  459.997257] pciehp 0000:2f:04.0:pcie24: unloading service driver pciehp
[  459.997495] general protection fault: 0000 [#1] SMP
[  459.997737] last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/0000:2e:00.0/0000:2f:04.0/remove
[  459.997964] CPU 4
[  459.998129] Modules linked in: pciehp ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc cpufreq_ondemand acpi_cpufreq dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod sbs sbshc battery ac parport_pc lp parport mptspi mptscsih mptbase scsi_transport_spi e1000e sg sr_mod cdrom button serio_raw i2c_i801 i2c_core shpchp pcspkr ata_piix libata megaraid_sas sd_mod scsi_mod crc_t10dif ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
[  459.998129] Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1 PRIMERGY
[  459.998129] RIP: 0010:[<ffffffff803bf047>]  [<ffffffff803bf047>] pci_slot_release+0x37/0x100
[  459.998129] RSP: 0018:ffff88083b3bf9e0  EFLAGS: 00010246
[  459.998129] RAX: ffff88083adc5158 RBX: ffff880836c1bc80 RCX: 6b6b6b6b6b6b6b6b
[  459.998129] RDX: 0000000000000000 RSI: ffffffff803a77f0 RDI: ffff880836c1bc48
[  459.998129] RBP: ffff88083b3bfa00 R08: 0000000000000002 R09: 0000000000000000
[  459.998129] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880836c1bc48
[  459.998129] R13: ffff880836c1bc20 R14: ffff880836c1bc48 R15: ffff880836d1ec38
[  459.998129] FS:  0000000000000000(0000) GS:ffff88083ccc3770(0000) knlGS:0000000000000000
[  459.998129] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  459.998129] CR2: 00007f1562f1d558 CR3: 0000000838090000 CR4: 00000000000006e0
[  459.998129] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  459.998129] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  459.998129] Process events/4 (pid: 56, threadinfo ffff88083b3be000, task ffff88083b3b3e40)
[  459.998129] Stack:
[  459.998129]  ffff880836c1bc80 ffff880836c1bc48 ffffffff80793320 ffff88083b0d0960
[  459.998129]  ffff88083b3bfa30 ffffffff803a788a ffff880836c1bc80 ffffffff803a77f0
[  459.998129]  ffff880836c1bc20 ffff880836d1ec38 ffff88083b3bfa50 ffffffff803a8ce7
[  459.998129] Call Trace:
[  459.998129]  [<ffffffff803a788a>] kobject_release+0x9a/0x290
[  459.998129]  [<ffffffff803a77f0>] ? kobject_release+0x0/0x290
[  459.998129]  [<ffffffff803a8ce7>] kref_put+0x37/0x80
[  459.998129]  [<ffffffff803a76f7>] kobject_put+0x27/0x60
[  459.998129]  [<ffffffff803bebcc>] ? pci_destroy_slot+0x3c/0xc0
[  459.998129]  [<ffffffff803bebd5>] pci_destroy_slot+0x45/0xc0
[  459.998129]  [<ffffffff803c797d>] pci_hp_deregister+0x13d/0x210
[  459.998129]  [<ffffffffa031141d>] cleanup_slots+0x2d/0x80 [pciehp]
[  459.998129]  [<ffffffffa0311735>] pciehp_remove+0x15/0x30 [pciehp]
[  459.998129]  [<ffffffff803c4c99>] pcie_port_remove_service+0x69/0x90
[  459.998129]  [<ffffffff80441da9>] __device_release_driver+0x59/0x90
[  459.998129]  [<ffffffff80441edb>] device_release_driver+0x2b/0x40
[  459.998129]  [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
[  459.998129]  [<ffffffff8043e46b>] device_del+0x12b/0x190
[  459.998129]  [<ffffffff803c4d90>] ? remove_iter+0x0/0x40
[  459.998129]  [<ffffffff8043e4f6>] device_unregister+0x26/0x70
[  459.998129]  [<ffffffff803c4dbf>] remove_iter+0x2f/0x40
[  459.998129]  [<ffffffff8043ddf3>] device_for_each_child+0x33/0x60
[  459.998129]  [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
[  459.998129]  [<ffffffff803c4d30>] pcie_port_device_remove+0x30/0x80
[  459.998129]  [<ffffffff803c55a1>] pcie_portdrv_remove+0x11/0x20
[  459.998129]  [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
[  459.998129]  [<ffffffff80441da9>] __device_release_driver+0x59/0x90
[  459.998129]  [<ffffffff80441edb>] device_release_driver+0x2b/0x40
[  459.998129]  [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
[  459.998129]  [<ffffffff8043e46b>] device_del+0x12b/0x190
[  459.998129]  [<ffffffff8043e4f6>] device_unregister+0x26/0x70
[  459.998129]  [<ffffffff803ba969>] pci_stop_dev+0x49/0x60
[  459.998129]  [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
[  459.998129]  [<ffffffff803c10d9>] remove_callback+0x29/0x40
[  459.998129]  [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
[  459.998129]  [<ffffffff8025769a>] run_workqueue+0x15a/0x230
[  459.998129]  [<ffffffff80257648>] ? run_workqueue+0x108/0x230
[  459.998129]  [<ffffffff8025846f>] worker_thread+0x9f/0x100
[  459.998129]  [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
[  459.998129]  [<ffffffff802583d0>] ? worker_thread+0x0/0x100
[  459.998129]  [<ffffffff8025b89d>] kthread+0x4d/0x80
[  459.998129]  [<ffffffff8020d4ba>] child_rip+0xa/0x20
[  459.998129]  [<ffffffff8020cebc>] ? restore_args+0x0/0x30
[  459.998129]  [<ffffffff8025b850>] ? kthread+0x0/0x80
[  459.998129]  [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
[  459.998129] Code: 56 49 89 fe 41 55 4c 8d 6f d8 41 54 53 74 09 f6 05 b8 05 c7 00 08 75 72 49 8b 45 00 48 8b 48 28 eb 05 66 90 48 89 f1 49 8b 45 00 <48> 8b 31 48 83 c0 28 0f 18 0e 48 39 c1 74 1c 8b 41 38 41 0f b6
[  459.998129] RIP  [<ffffffff803bf047>] pci_slot_release+0x37/0x100
[  459.998129]  RSP <ffff88083b3bf9e0>
[  460.018595] ---[ end trace 5a08d2095374aedc ]---

The pci_remove_bus_device() removes all buses and devices under the
bridge, and then removes the bridge. So the remove() callback of the
hotplug drivers implemented as a bridge's driver is executed after the
struct pci_bus of the bridge's secondary bus is removed. The remove()
callback of those driver unregisters the slot using pci_destroy_slot(),
and slot's release callback refers to the the struct pci_bus that was
already freed. This is the cause of the kernel oops.

This patch solves the problem by stopping bus drivers before removing the
bridge and its child bus and devices.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

drivers/pci/remove.c

index caf8e1e..86503c1 100644 (file)
@@ -95,6 +95,7 @@ EXPORT_SYMBOL(pci_remove_bus);
  */
 void pci_remove_bus_device(struct pci_dev *dev)
 {
+       pci_stop_bus_device(dev);
        if (dev->subordinate) {
                struct pci_bus *b = dev->subordinate;