Bug 42844 - System frozen if an USB device is connected to the USB 3.0 connector - Dell Vostro 3750
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB
Hardware: All Linux
: P1 normal
Assignee: Greg Kroah-Hartman
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-03-01 18:29 UTC by WZab
Modified: 2013-08-22 22:49 UTC

See Also:
Kernel Version: 3.2.9, 3.5.2, 3.5.3, 3.6-rc4, 3.7.1, 3.7.3, 3.8.2, 3.8.5, 3.9.2, 3.10.2, 3.10.7
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Logs described in the bug message (30.67 KB, application/zip)
2012-03-01 18:30 UTC, WZab
Output of the acpidump from the affected system (67.21 KB, application/postscript)
2012-09-03 13:19 UTC, WZab
logs from the boot of the system ending with freeze (7.03 KB, application/x-gzip)
2012-09-03 15:31 UTC, WZab
logs from correct boot (without device connected to USB 3.0 port) (8.87 KB, application/x-gzip)
2012-09-03 15:53 UTC, WZab
Output of acpidump after booting with a device connected to the USB3.0 port with i915.modeset=0 (after booting without such device acpidump returns strictly the same result) (244.91 KB, text/plain)
2013-08-16 20:40 UTC, WZab
The file contains dump of different system parameters after booting with USB 3.0 device and without any USB 3.0 device (4.82 KB, application/x-zip-compressed)
2013-08-16 20:59 UTC, WZab
output of dmesg and of lsmod after booting with an USB in USB3.0 port and without such device (both with i915.modeset=0) (34.97 KB, application/x-zip-compressed)
2013-08-16 21:36 UTC, WZab
Logs from the ACPI when system freezes in the "nosmp" mode (768.36 KB, application/x-zip-compressed)
2013-08-18 09:40 UTC, WZab
Decompiled DSDT table (27.23 KB, application/gzip)
2013-08-18 15:03 UTC, WZab
Found, where info about presence of the USB device survives after BIOS boot - in PCI config space (11.26 KB, application/x-zip-compressed)
2013-08-21 13:52 UTC, WZab

Description WZab 2012-03-01 18:29:04 UTC
I've experienced strange problems with my Dell Vostro 3750 machine.
If I boot it with a USB device connected to one of the USB 3.0 enabled ports, the machine freezes during boot, and the last message displayed is:

 [Firmware Bug]: ACPI(PEGP) defines _DOD but not _DOS

(That's why I am filing this bug report against the ACPI subsystem.)

However, this message is also displayed during successful boots.

I attach four logs obtained in different configurations:

test7_ftdi.log - FTDI converter connected to the USB 3.0 jack - system frozen
test7_ftdi_nonxhci.log - FTDI converter connected to the USB 2.0 jack - system booted, and rebooted normally
test8_disk.log - USB flash disk connected to the USB 3.0 jack - system frozen
test8_disk_nonxhci.log - USB flash disk connected to the USB 2.0 jack - system booted and rebooted normally

(all files are zipped into logs.zip archive, to save space)
Comment 1 WZab 2012-03-01 18:30:12 UTC
Created attachment 72510 [details]
Logs described in the bug message
Comment 2 WZab 2012-03-02 15:19:23 UTC
I have received some suggestions that the problem may be related to the xhci-hcd driver.
However, when I removed the xhci-hcd.ko driver (and updated the initial ramdisk, of course), the problem still persisted.
When the FLASH disk is connected to the USB 3.0 enabled port, the machine does not boot.

If I plug the disk into the same port after the machine has booted, the system keeps working (however, the device is not recognized unless I manually load the xhci-hcd.ko driver).

When I manually load the xhci-hcd driver after the device is plugged in, I get the following messages:

[  603.507239] xhci_hcd 0000:03:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[  603.507267] xhci_hcd 0000:03:00.0: setting latency timer to 64
[  603.507271] xhci_hcd 0000:03:00.0: xHCI Host Controller
[  603.507282] xhci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 3
[  603.507451] xhci_hcd 0000:03:00.0: irq 18, io mem 0xf2900000
[  603.507518] xhci_hcd 0000:03:00.0: irq 46 for MSI/MSI-X
[  603.507523] xhci_hcd 0000:03:00.0: irq 47 for MSI/MSI-X
[  603.507528] xhci_hcd 0000:03:00.0: irq 48 for MSI/MSI-X
[  603.507532] xhci_hcd 0000:03:00.0: irq 49 for MSI/MSI-X
[  603.507537] xhci_hcd 0000:03:00.0: irq 50 for MSI/MSI-X
[  603.507541] xhci_hcd 0000:03:00.0: irq 51 for MSI/MSI-X
[  603.507547] xhci_hcd 0000:03:00.0: irq 52 for MSI/MSI-X
[  603.507551] xhci_hcd 0000:03:00.0: irq 53 for MSI/MSI-X
[  603.507685] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002
[  603.507687] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[  603.507689] usb usb3: Product: xHCI Host Controller
[  603.507691] usb usb3: Manufacturer: Linux 3.2.9 xhci_hcd
[  603.507692] usb usb3: SerialNumber: 0000:03:00.0
[  603.507813] xHCI xhci_add_endpoint called for root hub
[  603.507815] xHCI xhci_check_bandwidth called for root hub
[  603.507847] hub 3-0:1.0: USB hub found
[  603.507854] hub 3-0:1.0: 2 ports detected
[  603.507920] xhci_hcd 0000:03:00.0: xHCI Host Controller
[  603.507924] xhci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 4
[  603.510773] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003
[  603.510776] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[  603.510778] usb usb4: Product: xHCI Host Controller
[  603.510779] usb usb4: Manufacturer: Linux 3.2.9 xhci_hcd
[  603.510781] usb usb4: SerialNumber: 0000:03:00.0
[  603.510864] xHCI xhci_add_endpoint called for root hub
[  603.510866] xHCI xhci_check_bandwidth called for root hub
[  603.510893] hub 4-0:1.0: USB hub found
[  603.510900] hub 4-0:1.0: 2 ports detected
[  603.808897] usb 3-2: new high-speed USB device number 2 using xhci_hcd
[  603.822521] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  603.823026] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  603.823505] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  603.823622] usb 3-2: New USB device found, idVendor=12d1, idProduct=1446
[  603.823626] usb 3-2: New USB device strings: Mfr=3, Product=2, SerialNumber=0
[  603.823630] usb 3-2: Product: HUAWEI Mobile
[  603.823632] usb 3-2: Manufacturer: HUAWEI Technology
[  603.823779] usb 3-2: ep 0x81 - rounding interval to 32768 microframes, ep desc says 0 microframes
[  603.823785] usb 3-2: ep 0x1 - rounding interval to 32768 microframes, ep desc says 0 microframes
[  603.823790] usb 3-2: ep 0x2 - rounding interval to 32768 microframes, ep desc says 0 microframes
[  603.823795] usb 3-2: ep 0x82 - rounding interval to 32768 microframes, ep desc says 0 microframes
[  603.826641] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  603.827395] scsi16 : usb-storage 3-2:1.0
[  603.827792] scsi17 : usb-storage 3-2:1.1
[  604.391761] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  604.392254] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  604.392784] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  604.393252] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  604.394686] usb 3-2: USB disconnect, device number 2
[  604.394905] xhci_hcd 0000:03:00.0: WARN: transfer error on endpoint
[  608.519167] usb 3-2: new high-speed USB device number 3 using xhci_hcd
[  608.533102] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  608.533599] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  608.534098] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  608.534232] usb 3-2: New USB device found, idVendor=12d1, idProduct=1436
[  608.534237] usb 3-2: New USB device strings: Mfr=4, Product=3, SerialNumber=0
[  608.534240] usb 3-2: Product: HUAWEI Mobile
[  608.534242] usb 3-2: Manufacturer: HUAWEI Technology
[  608.534410] usb 3-2: ep 0x87 - rounding interval to 32768 microframes, ep desc says 0 microframes
[  608.534413] usb 3-2: ep 0x5 - rounding interval to 32768 microframes, ep desc says 0 microframes
[  608.534417] usb 3-2: ep 0x6 - rounding interval to 32768 microframes, ep desc says 0 microframes
[  608.534419] usb 3-2: ep 0x88 - rounding interval to 32768 microframes, ep desc says 0 microframes
[  608.538344] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  608.538932] option 3-2:1.0: GSM modem (1-port) converter detected
[  608.539113] usb 3-2: GSM modem (1-port) converter now attached to ttyUSB0
[  608.541093] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  608.542128] cdc_ether 3-2:1.1: wwan0: register 'cdc_ether' at usb-0000:03:00.0-2, Mobile Broadband Network Device, 02:50:f3:00:00:00
[  608.542382] option 3-2:1.3: GSM modem (1-port) converter detected
[  608.542533] usb 3-2: GSM modem (1-port) converter now attached to ttyUSB1
[  608.542699] option 3-2:1.4: GSM modem (1-port) converter detected
[  608.542844] usb 3-2: GSM modem (1-port) converter now attached to ttyUSB2
[  608.543483] scsi22 : usb-storage 3-2:1.5
[  608.544285] scsi23 : usb-storage 3-2:1.6
[  608.590404] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  608.591151] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.546012] scsi 22:0:0:0: CD-ROM            HUAWEI   Mass Storage     2.31 PQ: 0 ANSI: 2
[  609.548947] scsi 23:0:0:0: Direct-Access     HUAWEI   SD Storage       2.31 PQ: 0 ANSI: 2
[  609.549002] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.550369] sr1: scsi-1 drive
[  609.550641] sr 22:0:0:0: Attached scsi CD-ROM sr1
[  609.550820] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.553350] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.553743] sd 23:0:0:0: [sdb] Attached SCSI removable disk
[  609.558444] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.563097] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.572390] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.576895] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.580257] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.734306] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.736858] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  611.736845] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  611.739127] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  613.732887] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  613.735578] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  614.608158] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  614.608783] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  614.609248] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  614.609781] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  614.611030] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  614.611529] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  614.611997] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  614.612529] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  614.711150] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  614.713771] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  615.416864] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  615.736452] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  615.739216] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
root@WZLap:/lib/modules/3.2.9/kernel/drivers/usb/host# 

Anyway, the device is operational (both as a USB disk and as a UMTS modem).

So to summarize:
The problem is not associated with the xhci_hcd driver. Even without this driver, booting the machine with a USB device plugged into a USB 3.0 capable port causes the system to freeze.
The xhci_hcd driver reports some problems in communication with the device, but in general a device plugged in after the system has booted works correctly.
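
To get a quick overview of a log like the one above, the xhci_hcd WARN lines can be tallied by message text. The following is only a minimal sketch, assuming the dmesg output is available as plain text in the format shown in this comment; the function name count_xhci_warnings is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# "[  603.822521] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep"
WARN_RE = re.compile(r"xhci_hcd [0-9a-f:.]+: WARN: (.+?)\s*$")


def count_xhci_warnings(dmesg_text):
    """Return a Counter mapping each xhci_hcd WARN message to its count."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = WARN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the log above.
sample = """\
[  603.822521] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  604.394905] xhci_hcd 0000:03:00.0: WARN: transfer error on endpoint
[  608.591151] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.549002] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
"""

for message, count in count_xhci_warnings(sample).most_common():
    print(count, message)
```

Run against a full log, this shows at a glance which kind of warning dominates.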
Comment 3 WZab 2012-03-02 17:21:32 UTC
I've upgraded the BIOS in my Dell Vostro to the latest version, A11. It didn't help.
I've also tested booting Windows 7 with the same hardware configuration. No problem occurred with a USB device connected to either port.
Comment 4 Len Brown 2012-03-06 02:28:27 UTC
re: [Firmware Bug]: ACPI(PEGP) defines _DOD but not _DOS

This is related to the ACPI video driver
You can get rid of the ACPI video driver at boot time this way:
acpi_backlight=vendor
and this message will go away.
(you are welcome to file a bug against ACPI on this topic,
 and attach the output from acpidump, but we may not be able
 to help other than this workaround)

In any case, it has nothing to do with the content of
this bug report, which is specific to USB.
Comment 5 Greg Kroah-Hartman 2012-03-06 04:47:42 UTC
All USB bugs should be sent to the linux-usb@vger.kernel.org mailing
list, and not entered into bugzilla.  Please bring this issue up there,
if it is still a problem in the latest kernel release.
Comment 6 WZab 2012-08-15 23:05:25 UTC
Now, with kernel 3.5.2 I've checked this problem once again.
When booting with a device connected to the USB 3.0 port, the system hangs as before.
However, the system starts correctly when booted with the "pci=noacpi" option, and all USB devices are visible.

So it seems that this is rather an ACPI-related bug.
Comment 7 WZab 2012-08-19 11:34:16 UTC
Unfortunately booting with "pci=noacpi" causes some problems:
1. Not all sensors are visible for the "sensors" program
2. Control of the cooling fan is impaired - when CPU temperature increases the fan switches on, and does not switch off when CPU is cooled.
Comment 8 WZab 2012-08-20 08:29:41 UTC
Output of sensors command when booted without "pci=noacpi"
# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1:        +54.0°C  (crit = +100.0°C)
temp2:        +54.0°C  (crit = +100.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +50.0°C  (high = +86.0°C, crit = +100.0°C)
Core 0:         +46.0°C  (high = +86.0°C, crit = +100.0°C)
Core 1:         +49.0°C  (high = +86.0°C, crit = +100.0°C)
Core 2:         +49.0°C  (high = +86.0°C, crit = +100.0°C)
Core 3:         +48.0°C  (high = +86.0°C, crit = +100.0°C)

nouveau-pci-0100
Adapter: PCI adapter
temp1:        +54.0°C  (high = +100.0°C, crit = +110.0°C)

Output of sensors command when booted with "pci=noacpi"
# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1:        +44.0°C  (crit = +100.0°C)
temp2:        +44.0°C  (crit = +100.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +46.0°C  (high = +86.0°C, crit = +100.0°C)
Core 0:         +42.0°C  (high = +86.0°C, crit = +100.0°C)
Core 1:         +44.0°C  (high = +86.0°C, crit = +100.0°C)
Core 2:         +45.0°C  (high = +86.0°C, crit = +100.0°C)
Core 3:         +46.0°C  (high = +86.0°C, crit = +100.0°C)
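
The two listings above can also be compared mechanically to see which sensor chips disappear under "pci=noacpi". This is a rough sketch, assuming the plain-text layout of the sensors output shown above, where each chip name is directly followed by an "Adapter:" line; the helper name sensor_chips is invented for this example, and the sample strings are abbreviated from the two listings.

```python
def sensor_chips(sensors_output):
    """Return the set of chip names found in `sensors` text output.

    A chip name is taken to be any non-empty line that is immediately
    followed by a line starting with "Adapter:".
    """
    lines = sensors_output.splitlines()
    return {
        lines[i].strip()
        for i in range(len(lines) - 1)
        if lines[i].strip() and lines[i + 1].startswith("Adapter:")
    }


# Abbreviated from the output without "pci=noacpi".
normal_boot = """\
acpitz-virtual-0
Adapter: Virtual device
temp1:        +54.0°C  (crit = +100.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Core 0:         +46.0°C  (high = +86.0°C, crit = +100.0°C)

nouveau-pci-0100
Adapter: PCI adapter
temp1:        +54.0°C  (high = +100.0°C, crit = +110.0°C)
"""

# Abbreviated from the output with "pci=noacpi".
noacpi_boot = """\
acpitz-virtual-0
Adapter: Virtual device
temp1:        +44.0°C  (crit = +100.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Core 0:         +42.0°C  (high = +86.0°C, crit = +100.0°C)
"""

missing = sensor_chips(normal_boot) - sensor_chips(noacpi_boot)
print(sorted(missing))
```

For the listings in this comment, the only chip missing with "pci=noacpi" is the nouveau one.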
Comment 9 WZab 2012-09-03 13:19:45 UTC
Created attachment 79071 [details]
Output of the acpidump from the affected system
Comment 10 WZab 2012-09-03 15:31:19 UTC
Created attachment 79111 [details]
logs from the boot of the system ending with freeze

Using netconsole, I've caught some logs from a system start with a device (Huawei modem) connected to the USB 3.0 port. The boot ended (as usual) with a system freeze.
There are some interesting messages in the system log:
[   70.322826] ACPI Warning: 0x000000000000efa0-0x000000000000efbf SystemIO conflicts with Region \_SB_.PCI0.SBUS.SMBI 1 (20120320/utaddress-251)
[   70.323024] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[   70.478259] agpgart-intel 0000:00:00.0: Intel Sandybridge Chipset
[   70.478613] agpgart-intel 0000:00:00.0: detected gtt size: 2097152K total, 262144K mappable
[   70.480612] agpgart-intel 0000:00:00.0: detected 65536K stolen memory
[   70.481332] agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xe0000000
[   70.506485] ACPI Warning: 0x0000000000000460-0x000000000000047f SystemIO conflicts with Region \PMIO 1 (20120320/utaddress-251)
[   70.506683] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[   70.506806] lpc_ich: Resource conflict(s) found affecting iTCO_wdt
[   70.506883] ACPI Warning: 0x0000000000000428-0x000000000000042f SystemIO conflicts with Region \PMIO 1 (20120320/utaddress-251)
[   70.507069] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[   70.507194] ACPI Warning: 0x0000000000000500-0x000000000000057f SystemIO conflicts with Region \GPIN 1 (20120320/utaddress-251)
[   70.507400] ACPI Warning: 0x0000000000000500-0x000000000000057f SystemIO conflicts with Region \GPIO 2 (20120320/utaddress-251)
[   70.507614] ACPI Warning: 0x0000000000000500-0x000000000000057f SystemIO conflicts with Region \_SB_.PCI0.PEG0.PEGP.GPIO 3 (20120320/utaddress-251)
[   70.507832] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[   70.507925] lpc_ich: Resource conflict(s) found affecting gpio_ich
Comment 11 WZab 2012-09-03 15:53:27 UTC
Created attachment 79121 [details]
logs from correct boot (without device connected to USB 3.0 port)

I've also recorded logs from a boot without anything connected to the USB 3.0 connector.
The messages about ACPI conflicts are present there as well.

When system freezes, the last lines are:

[   71.771678] drm: registered panic notifier
[   71.772957] [Firmware Bug]: ACPI(PEGP) defines _DOD but not _DOS

In successful boot this part of log looks as follows:

[   76.030595] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[   76.080411] fbcon: inteldrmfb (fb1) is primary device
[   76.080432] fbcon: Remapping primary device, fb1, to tty 1-63
[   76.678688] fb1: inteldrmfb frame buffer device
[   76.679837] [Firmware Bug]: ACPI(PEGP) defines _DOD but not _DOS
[   76.687121] acpi device:28: registered as cooling_device8
[   76.687721] ACPI: Video Device [PEGP] (multi-head: yes  rom: yes  post: no)
[   76.688006] input: Video Bus as /devices/LNXSYSTM:00/device:00/PNP0A08:00/device:27/LNXVIDEO:00/input/input10
[   76.708727] acpi device:2b: registered as cooling_device9
[   76.711022] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)

Could it mean, that the problem is associated with access to devices 28 and/or 29?
Comment 12 WZab 2012-09-03 15:56:06 UTC
Of course, I meant devices 28 and 2b.
Comment 13 WZab 2012-09-03 16:42:42 UTC
I have found one more fact about the Dell Vostro which may explain the problem.
The USB 3.0 controller uses the IRQ 18:

03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) (prog-if 30 [XHCI])
        Subsystem: Dell Device 04c6
        Flags: bus master, fast devsel, latency 0, IRQ 18
        Memory at f2900000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [50] Power Management version 3
        Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
        Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
        Capabilities: [150] Latency Tolerance Reporting
        Kernel driver in use: xhci_hcd

The SMBus controller uses the same IRQ:

00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 05)
        Subsystem: Dell Device 04c6
        Flags: medium devsel, IRQ 18
        Memory at f2b04000 (64-bit, non-prefetchable) [size=256]
        I/O ports at efa0 [size=32]

Maybe the problem is associated with interrupt handling in the SMBus driver?
If any device is connected to the USB 3.0 jack during boot, does the USB 3.0 controller generate IRQ 18 interrupts which cause a hang in the SMBus driver?
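
To check for other devices on the same interrupt line, the "lspci -v" output can be grouped by the IRQ number in the "Flags:" line. The code below is a minimal sketch that assumes the text layout shown in the two listings above; devices_by_irq is a hypothetical helper name, and a shared number in this output is only a hint, not proof that the devices actually interfere.

```python
import re
from collections import defaultdict

IRQ_RE = re.compile(r"\bFlags:.*\bIRQ (\d+)")


def devices_by_irq(lspci_text):
    """Group PCI addresses by the IRQ number listed in their Flags line."""
    irq_map = defaultdict(list)
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Device header line, e.g. "03:00.0 USB controller: ..."
            current = line.split(" ", 1)[0]
            continue
        match = IRQ_RE.search(line)
        if match and current is not None:
            irq_map[int(match.group(1))].append(current)
    return dict(irq_map)


# Shortened from the two listings above.
sample = """\
03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) (prog-if 30 [XHCI])
        Subsystem: Dell Device 04c6
        Flags: bus master, fast devsel, latency 0, IRQ 18
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 05)
        Subsystem: Dell Device 04c6
        Flags: medium devsel, IRQ 18
"""

for irq, devices in devices_by_irq(sample).items():
    if len(devices) > 1:
        print("IRQ", irq, "is listed for", ", ".join(devices))
```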
Comment 14 WZab 2012-09-03 17:06:28 UTC
I've simply removed the i2c-core.ko, i2c-dev.ko and i2c-smbus.ko drivers from the /lib/mod/kernel_version/kernel/i2c directory and rebuilt the initramfs.
After this modification the system booted.
(However, the lack of I2C also prevented other modules, such as i915, nouveau and some others, from loading, so it is not clear whether the problem is really associated with the interrupt handling in one of the i2c-... drivers.)
Comment 15 WZab 2012-09-03 17:16:32 UTC
According to http://lxr.free-electrons.com/source/drivers/i2c/i2c-smbus.c#L124

the interrupt service routine only disables the IRQ at the interrupt controller:

static irqreturn_t smbalert_irq(int irq, void *d)
{
        struct i2c_smbus_alert *alert = d;

        /* Disable level-triggered IRQs until we handle them */
        if (!alert->alert_edge_triggered)
                disable_irq_nosync(irq);

        schedule_work(&alert->alert);
        return IRQ_HANDLED;
}

and schedules the deferred work that is supposed to handle it.
For an interrupt line that is shared with the xHCI controller, this may cause problems.

Am I wrong?
Comment 16 WZab 2012-09-03 21:43:22 UTC
If the real cause of the problem is incorrect sharing of IRQ 18 between the SMBus controller and the xHCI USB controller, maybe this could also explain the problems with USB transfers reported in comment 2 of this bug:

[  603.808897] usb 3-2: new high-speed USB device number 2 using xhci_hcd
[  603.822521] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  603.823026] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  603.823505] xhci_hcd 0000:03:00.0: WARN: short transfer on control ep
[  603.823622] usb 3-2: New USB device found, idVendor=12d1, idProduct=1446
[  603.823626] usb 3-2: New USB device strings: Mfr=3, Product=2,
[...]
[  609.550820] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.553350] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.553743] sd 23:0:0:0: [sdb] Attached SCSI removable disk
[  609.558444] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.563097] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.572390] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint
[  609.576895] xhci_hcd 0000:03:00.0: WARN: Stalled endpoint

So if my diagnosis is correct, the problem is neither a USB bug nor an ACPI bug, but a bug in the I2C/SMBus driver.

The smbalert_irq routine should quickly identify whether the interrupt was caused by the I2C controller and return either IRQ_HANDLED or IRQ_NONE accordingly.
Additionally, it shouldn't block the interrupt at the interrupt controller (which can cause unacceptable latency in servicing USB 3.0 interrupts), but in the I2C controller itself (is that possible in a hardware-independent manner?).
Comment 17 WZab 2012-09-04 07:51:38 UTC
Well, in fact in my system the i2c-i801 driver is used.
It offers the "disable_features" parameter, but I can't see a way to disable the use of interrupts :-(
Comment 18 WZab 2012-09-04 08:36:03 UTC
I've performed a few more tests with selective removal of drivers that use the I2C interface.
I've found that if I remove both the i915 and nouveau drivers, the system boots with a USB disk connected to the USB 3.0 connector.
If ANY of those drivers remains, the system freezes when booting with a USB disk connected to the USB 3.0 connector.

This finding undermines my theory that the problem is associated with sharing of the IRQ 18 line, as e.g. both the xhci_hcd and i915 drivers use MSI interrupts:

# cat /proc/interrupts
[...]
 41:          1          0          0          0          0          0          0          0   PCI-MSI-edge      xhci_hcd
 42:          0          0          0          0          0          0          0          0   PCI-MSI-edge      xhci_hcd
 43:          0          0          0          0          0          0          0          0   PCI-MSI-edge      xhci_hcd
 44:          0          0          0          0          0          0          0          0   PCI-MSI-edge      xhci_hcd
 45:          0          0          0          0          0          0          0          0   PCI-MSI-edge      xhci_hcd
 46:          0          0          0          0          0          0          0          0   PCI-MSI-edge      xhci_hcd
 47:          0          0          0          0          0          0          0          0   PCI-MSI-edge      xhci_hcd
 48:          0          0          0          0          0          0          0          0   PCI-MSI-edge      xhci_hcd
 49:       1487          0          0          0          0          0          0          0   PCI-MSI-edge      eth0
 50:      35106          0          0          0          0          0          0          0   PCI-MSI-edge      ahci
 51:        436          0          0          0          0          0          0          0   PCI-MSI-edge      iwlwifi
 52:        899          0          0          0          0          0          0          0   PCI-MSI-edge      i915
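
The same check can be scripted: the sketch below lists which drivers appear with an MSI interrupt type in /proc/interrupts. It assumes the column layout shown above (IRQ number, one count per CPU, interrupt type, driver name); parse_interrupts and msi_users are hypothetical helper names, and other kernel versions may format the type column differently.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into (irq, total_count, type, name) tuples.

    Only numbered IRQ rows are returned; the CPU header, elisions such as
    "[...]" and named rows are skipped.
    """
    rows = []
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue
        fields = rest.split()
        counts = []
        while fields and fields[0].isdigit():
            counts.append(int(fields.pop(0)))
        if len(fields) < 2:
            continue
        rows.append((int(head), sum(counts), fields[0], " ".join(fields[1:])))
    return rows


def msi_users(text):
    """Return the names of all handlers whose interrupt type starts with PCI-MSI."""
    return {name for _, _, kind, name in parse_interrupts(text)
            if kind.startswith("PCI-MSI")}


# Three rows copied from the listing above.
sample = """\
[...]
 41:          1          0          0          0          0          0          0          0   PCI-MSI-edge      xhci_hcd
 49:       1487          0          0          0          0          0          0          0   PCI-MSI-edge      eth0
 52:        899          0          0          0          0          0          0          0   PCI-MSI-edge      i915
"""

print(sorted(msi_users(sample)))
```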
Comment 19 WZab 2013-01-02 21:53:30 UTC
I have upgraded the BIOS in my Dell Vostro 3750 to the newest version A14.
Unfortunately the problem still exists, even with the newest 3.7.1 kernel.
Comment 20 WZab 2013-01-18 22:33:56 UTC
I have seen that the 3.7.3 kernel introduced a lot of changes related to xhci, so I've checked whether it works with a device (FLASH disk) connected to the USB 3.0 connector.
Unfortunately the problem still exists.
Comment 21 WZab 2013-01-19 23:09:06 UTC
Today I performed the tests suggested by Jonathan Nieder in http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=644174 and it turned out that:

1. Vostro 3750 is able to resume from hibernation with a FLASH disk connected to the USB 3.0 port (of course I had to connect the disk after the system started, and then hibernate it).

3. Vostro 3750 is able to start with a FLASH disk connected to the USB 3.0 port if I add "i915.modeset=0" as a kernel parameter, when booting from grub.

Tests were performed with the 3.7.3 kernel.
When booting with the disk inserted into the USB 3.0 port without i915.modeset=0, the system still hangs.
Comment 22 WZab 2013-03-06 19:08:10 UTC
I have just compiled kernel 3.8.2
The problem still persists :-(.
Comment 23 WZab 2013-03-11 19:12:16 UTC
I have discovered one interesting fact regarding this bug.
If I:
1. start my Vostro 3750 with a USB device connected to the USB 3.0 port,
2. remove this device as soon as the grub menu appears,
3. select booting Linux,

the system hangs, as described in the bug report.

So it is not the presence of the device while Linux is booting that causes the system to hang.
It is improper handling of some BIOS data structures, which are set up when a USB 3.0 connector is in use during power-up.
However, in the same scenario Windows boots correctly when selected in step 3.
Comment 24 WZab 2013-08-16 14:54:20 UTC
I've checked the new stable kernel 3.10.7.
The problem is still present.

I've booted the system with the "i915.modeset=0" argument. The system boots correctly in this case, but only the 1024x768 video mode is available and graphics performance is poor.

Then I removed the i915 module:
"rmmod i915"
When I asked the system to reload the i915 module with
"modprobe i915 modeset=1"
the system hung as described previously.
Therefore it seems that the problem is associated with initialization of the i915 module when any device is connected to the USB 3.0 interface.
Comment 25 WZab 2013-08-16 20:19:26 UTC
I've performed yet another experiment.
After booting the kernel with a device connected to the USB 3.0 port and with the i915.modeset=0 argument, I disconnected the device from the USB 3.0 port, then removed the i915 module and reloaded it with modeset=1.

This also caused the system to hang.
So evidently it is not the presence of the device in the USB 3.0 port at the time the i915 module is loaded that causes the i915 initialization routines to loop forever, but data structures built during the boot...

Maybe when the system starts with a device connected to the USB 3.0 port, some resources are allocated for that device, and the i915 driver is then unable to allocate those resources for itself and loops forever waiting for them?

I'll try to compare the state of the system after booting with i915.modeset=0 in two conditions:
1. With a device connected to USB 3.0 port
2. Without any device connected to the USB 3.0 port.

I'd appreciate any suggestions as to what is worth comparing (apart from the results of acpidump, of course)...
Comment 26 WZab 2013-08-16 20:40:40 UTC
Created attachment 107223 [details]
Output of acpidump after booting with a device connected to the USB3.0 port with i915.modeset=0 (after booting without such device acpidump returns strictly the same result)
Comment 27 WZab 2013-08-16 20:59:03 UTC
Created attachment 107224 [details]
The file contains dump of different system parameters after booting with USB 3.0 device and without any USB 3.0 device

I've dumped contents of:
/proc/interrupts
/proc/iomem
/proc/ioports

after booting with USB3.0 device and without any USB3.0 device (both with i915.modeset=0).
Unfortunately I can't see any significant difference in system configuration between the two cases :-(.
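
A comparison like this can be automated so that nothing is missed by eye. The following is only a sketch: it assumes the contents of one of the files above were saved as text after each boot, diff_snapshots is a made-up helper name, and the sample strings are placeholders rather than data from the attachment. Note that the counters in /proc/interrupts change on every boot, so for that file only differences in the set of listed handlers are meaningful.

```python
import difflib


def diff_snapshots(without_usb, with_usb, name):
    """Return unified-diff lines between two text snapshots of the same file."""
    return list(difflib.unified_diff(
        without_usb.splitlines(),
        with_usb.splitlines(),
        fromfile=name + " (no USB 3.0 device)",
        tofile=name + " (USB 3.0 device present)",
        lineterm="",
        n=0,  # no context lines, only the changes themselves
    ))


# Placeholder snapshots (NOT taken from the attachment); identical on purpose.
ioports_without = "0000-0cf7 : PCI Bus 0000:00\n0cf8-0cff : PCI conf1\n"
ioports_with = "0000-0cf7 : PCI Bus 0000:00\n0cf8-0cff : PCI conf1\n"

changes = diff_snapshots(ioports_without, ioports_with, "/proc/ioports")
print("\n".join(changes) if changes else "no differences")
```

An empty result corresponds to the observation above that the two boots look the same at this level.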
Comment 28 WZab 2013-08-16 21:36:39 UTC
Created attachment 107225 [details]
output of dmesg and of lsmod after booting with an USB in USB3.0 port and without such device (both with i915.modeset=0)

I've dumped the output of the "dmesg" and "lsmod" commands after booting with a USB device connected to the USB 3.0 port and without such a device.

I suspect that the presence of such a device may affect the boot order, and that the system may hang when some modules are loaded in a different order.

The device used for the tests was the "Afatech Technologies, Inc. AF9015 DVB-T USB2.0 stick", so of course some drivers were only loaded when this device was present...
Comment 29 WZab 2013-08-17 14:01:03 UTC
Today I've checked how long the USB device has to stay connected to the USB 3.0 capable port in order to trigger the system freeze when loading the i915 driver.

I've found that even if the device is disconnected right after the BIOS asks for the power-up password, the system still freezes.

So the problem is obviously somehow related to the way the BIOS initializes the system when a USB device is present in the USB 3.0 port.

I'd appreciate any suggestions on how I can further investigate how the state of the system after power-up with a USB device differs from its state after power-up without one...
-- 
TIA, Wojtek
Comment 30 WZab 2013-08-18 09:40:39 UTC
Created attachment 107235 [details]
Logs from the ACPI when system freezes in the "nosmp" mode

I have started the system with 3.10.2 kernel with ACPI debugging on.
The system was booted with "i915.modeset=0 nosmp" parameters.
I have also started the netconsole.
After the system booted, I have also done:
# rmmod i915
# echo 0x03222207 > /sys/module/acpi/parameters/debug_level
# echo 0xffff0089 > /sys/module/acpi/parameters/debug_layer
# dmesg -n debug
and finally:

# modprobe i915 modeset=1

The system crashed, however I have observed two different behaviours.

In the first (logi3.txt) the system hangs after the following access:
[  484.634332]  exfldio-0243 [838799360] [17] ex_access_region      : ----Entry
[  484.636107]  exfldio-0090 [838799360] [18] ex_setup_region       : ----Entry 00000000
[  484.637864]  exfldio-0214 [838799360] [18] ex_setup_region       : ----Exit- AE_OK
[  484.639770] exregion-0299 [838799360] [19] ex_system_io_space_han: ----Entry
[  484.641511] exregion-0304 [838799360] [19] ex_system_io_space_han: System-IO (width 8) R/W 1 Address=00000000000000B2

In the second (logi4.txt) the system loops forever.
Unfortunately I don't know (yet?) how to find the source of the problem using the logs produced...
I'd appreciate any hints...
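
One way to narrow down a log of this size is to extract the last System-IO access that was logged before the hang. The code below is a sketch that assumes the ACPI debug line format quoted above; last_system_io_access is a hypothetical helper name. Two readings in it are assumptions rather than facts from the log: that "R/W 1" denotes a write (as in ACPICA's ACPI_WRITE constant), and that port 0xB2 is the APM control port which on many Intel chipsets raises an SMI when written.

```python
import re

# Matches e.g.
# "[  484.641511] exregion-0304 [...] ex_system_io_space_han: System-IO (width 8) R/W 1 Address=00000000000000B2"
IO_RE = re.compile(
    r"\[\s*(\d+\.\d+)\].*System-IO \(width (\d+)\) R/W (\d) Address=([0-9A-Fa-f]+)"
)


def last_system_io_access(log_text):
    """Return details of the last System-IO access in an ACPI debug log, or None."""
    last = None
    for line in log_text.splitlines():
        match = IO_RE.search(line)
        if match:
            last = {
                "time": float(match.group(1)),
                "width": int(match.group(2)),
                # Assumption: 1 means write, 0 means read.
                "is_write": match.group(3) == "1",
                "port": int(match.group(4), 16),
            }
    return last


# Two lines copied from the excerpt above.
sample = """\
[  484.639770] exregion-0299 [838799360] [19] ex_system_io_space_han: ----Entry
[  484.641511] exregion-0304 [838799360] [19] ex_system_io_space_han: System-IO (width 8) R/W 1 Address=00000000000000B2
"""

access = last_system_io_access(sample)
print(access["time"], hex(access["port"]), "write" if access["is_write"] else "read")
```

If the last logged access is indeed a write to the SMI command port, the hang would be happening inside firmware code rather than in the kernel, which would fit the lack of any further kernel messages.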
Comment 31 WZab 2013-08-18 14:39:49 UTC
I have checked whether Windows 7 (which boots successfully with a USB device connected to the USB 3.0 capable port) patches the DSDT table.

I have extracted the DSDT table in Windows, as described in http://tonycr.wordpress.com/xp_dsdt/ , and I have found that the DSDT table used by Windows 7 is EXACTLY the same as the DSDT table visible to Linux :-(.

So the problem is probably not caused by a wrong DSDT table, but is Linux-specific...
Comment 32 WZab 2013-08-18 15:03:09 UTC
Created attachment 107236 [details]
Decompiled DSDT table

I attach the decompiled DSDT table from the system exposing the bug.
Comment 33 WZab 2013-08-18 21:51:06 UTC
It seems that the same problem appears not only on the Dell Vostro 3750, but also on laptops of other brands: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/978891
Comment 34 WZab 2013-08-21 13:52:09 UTC
Created attachment 107271 [details]
Found, where info about presence of the USB device survives after BIOS boot - in PCI config space

I have booted the 3.10.2 kernel with parameters:
i915.modeset=0 pci=earlydump in two scenarios, and stored the initial part of the dmesg output:

1. With a USB device connected to the USB 3.0 port during power-up, and disconnected when the BIOS asked for the power-up password
(results in dmesg_start_with_usb30.txt)
2. Without any USB device connected to the USB 3.0 port at any time
(results in dmesg_start_without_usb30.txt)

I compared the results and found that the contents of the PCI configuration space differ between these two situations.
So this is where the information about the presence of a USB device in the USB 3.0 port is carried over from power-up to the booting of the kernel.

What is unclear is how this difference affects the mode switching in the i915 driver and causes the system freeze...

The difference is here (marked with '!'):

A. in the pci 0000:00:1a.0 config space:

With USB device:
[    0.000000] pci 0000:00:1a.0 config space:
[    0.000000]   00: 86 80 2d 1c 06 00 90 02 05 20 03 0c 00 00 00 00
[    0.000000]   10: 00 a0 b0 f2 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   20: 00 00 00 00 00 00 00 00 00 00 00 00 28 10 c6 04
[    0.000000]   30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 00 00
[    0.000000]   40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   50: 01 58 c2 c9 00 00 00 00 0a 98 a0 20 00 00 00 00
[    0.000000]   60: 20 20 ff 07 00 00 00 00 01 00 00 00 00 00 08 c0
[    0.000000]   70: 00 00 df 3f 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   80: 00 00 80 00 11 88 0c 93 30 0d 00 24 00 00 00 00
[    0.000000]   90: 00 00 00 00 00 00 00 00 13 00 06 03 00 00 00 00
[    0.000000]   a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 04 !40 c0 36
[    0.000000]   f0: 00 00 00 00 88 85 80 00 87 0f 06 08 08 17 5b 20

Without the USB device:
[    0.000000] pci 0000:00:1a.0 config space:
[    0.000000]   00: 86 80 2d 1c 06 00 90 02 05 20 03 0c 00 00 00 00
[    0.000000]   10: 00 a0 b0 f2 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   20: 00 00 00 00 00 00 00 00 00 00 00 00 28 10 c6 04
[    0.000000]   30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 00 00
[    0.000000]   40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   50: 01 58 c2 c9 00 00 00 00 0a 98 a0 20 00 00 00 00
[    0.000000]   60: 20 20 ff 07 00 00 00 00 01 00 00 00 00 00 08 c0
[    0.000000]   70: 00 00 df 3f 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   80: 00 00 80 00 11 88 0c 93 30 0d 00 24 00 00 00 00
[    0.000000]   90: 00 00 00 00 00 00 00 00 13 00 06 03 00 00 00 00
[    0.000000]   a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 04 !00 c0 36
[    0.000000]   f0: 00 00 00 00 88 85 80 00 87 0f 06 08 08 17 5b 20

B. in the pci 0000:00:1d.0 config space:

With USB device:
[    0.000000]   00: 86 80 26 1c 06 00 90 02 05 20 03 0c 00 00 00 00
[    0.000000]   10: 00 90 b0 f2 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   20: 00 00 00 00 00 00 00 00 00 00 00 00 28 10 c6 04
[    0.000000]   30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 00 00
[    0.000000]   40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   50: 01 58 c2 c9 00 00 00 00 0a 98 a0 20 00 00 00 00
[    0.000000]   60: 20 20 ff 07 00 00 00 00 01 00 00 00 00 00 08 c0
[    0.000000]   70: 00 00 df 3f 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   80: 00 00 80 00 11 88 0c 93 30 0d 00 24 00 00 00 00
[    0.000000]   90: 00 00 00 00 00 00 00 00 13 00 06 03 00 00 00 00
[    0.000000]   a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 !04 !f0 !c1 36
[    0.000000]   f0: 00 00 00 00 88 85 80 00 87 0f 06 08 08 17 5b 20

Without USB device:
[    0.000000] pci 0000:00:1d.0 config space:
[    0.000000]   00: 86 80 26 1c 06 00 90 02 05 20 03 0c 00 00 00 00
[    0.000000]   10: 00 90 b0 f2 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   20: 00 00 00 00 00 00 00 00 00 00 00 00 28 10 c6 04
[    0.000000]   30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 00 00
[    0.000000]   40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   50: 01 58 c2 c9 00 00 00 00 0a 98 a0 20 00 00 00 00
[    0.000000]   60: 20 20 ff 07 00 00 00 00 01 00 00 00 00 00 08 c0
[    0.000000]   70: 00 00 df 3f 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   80: 00 00 80 00 11 88 0c 93 30 0d 00 24 00 00 00 00
[    0.000000]   90: 00 00 00 00 00 00 00 00 13 00 06 03 00 00 00 00
[    0.000000]   a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[    0.000000]   c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 !e4 !e4 !c3 36
[    0.000000]   f0: 00 00 00 00 88 85 80 00 87 0f 06 08 08 17 5b 20

The funny thing is that the affected devices are the USB 2.0 controllers, not the USB 3.0 controller, and the changed field is the "Capabilities pointer"...
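
The comparison above was done by eye; a small Python sketch like the following could automate it for the two dmesg files. The function names are hypothetical, and the parser assumes complete rows in the "pci=earlydump" layout shown above (an offset followed by 16 bytes), so any truncated row is skipped. The second sample row is the first one with a single byte changed purely for illustration; it is not taken from the logs.

```python
import re

# One complete row of a "pci=earlydump" dump, e.g.
#   [    0.000000]   50: 01 58 c2 c9 00 00 00 00 0a 98 a0 20 00 00 00 00
ROW = re.compile(r"\]\s+([0-9a-f]{2}):((?:\s+[0-9a-f]{2}){16})\s*$")


def parse_dump(text):
    """Map config-space offset -> byte value for every complete 16-byte row."""
    space = {}
    for line in text.splitlines():
        m = ROW.search(line)
        if m:
            base = int(m.group(1), 16)
            for i, byte in enumerate(m.group(2).split()):
                space[base + i] = int(byte, 16)
    return space


def diff_dumps(a, b):
    """Return sorted (offset, byte_a, byte_b) for offsets that differ in both dumps."""
    common = sorted(a.keys() & b.keys())
    return [(off, a[off], b[off]) for off in common if a[off] != b[off]]


# Row 0x50 as quoted above, and a hypothetical variant with byte 0x5b altered.
with_dev = "[    0.000000]   50: 01 58 c2 c9 00 00 00 00 0a 98 a0 20 00 00 00 00"
without_dev = "[    0.000000]   50: 01 58 c2 c9 00 00 00 00 0a 98 a0 21 00 00 00 00"

for off, x, y in diff_dumps(parse_dump(with_dev), parse_dump(without_dev)):
    print(f"offset {off:#04x}: {x:02x} -> {y:02x}")
```

Reporting absolute offsets makes it easier to look the differing bytes up in the chipset datasheet than counting columns in the raw dump.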


00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5)
00:1c.2 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 3 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b5)
00:1c.5 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 6 (rev b5)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation HM67 Express Chipset Family LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 05)
01:00.0 VGA compatible controller: NVIDIA Corporation GF108M [GeForce GT 525M] (rev a1)
02:00.0 Network controller: Intel Corporation Centrino Wireless-N 1030 [Rainbow Peak] (rev 34)
03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04)
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)

The affected area is
Comment 35 WZab 2013-08-21 15:23:16 UTC
Hmmm, after some more experiments I have found that the "Cap. pointer" value varies somewhat randomly, so I don't know whether it can explain how the computer remembers that the device was present before the power-up password was entered.....
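
As a cross-check on which field is actually changing, the standard PCI Capabilities Pointer sits at offset 0x34 of a type-0 header, and in all four dumps quoted in comment 34 that byte reads 0x50. The Python sketch below (hypothetical helper names, rows copied from the 00:1a.0 dump with the dmesg timestamp stripped) walks the capability list starting from that offset. If the list comes out the same in both boot scenarios, the varying bytes probably belong to some other, possibly vendor-specific, register rather than to the capability pointer itself.

```python
CAP_PTR_OFFSET = 0x34  # Capabilities Pointer in a type-0 PCI configuration header


def parse_rows(rows):
    """Turn 'offset: b0 b1 ...' rows into a dict of offset -> byte value."""
    space = {}
    for row in rows:
        offset, _, data = row.partition(":")
        base = int(offset, 16)
        for i, byte in enumerate(data.split()):
            space[base + i] = int(byte, 16)
    return space


def walk_capabilities(space):
    """Yield (offset, capability_id) by following the linked list from 0x34."""
    ptr = space[CAP_PTR_OFFSET] & 0xFC  # the low two bits are reserved
    seen = set()
    while ptr and ptr not in seen:  # guard against a corrupted, looping list
        seen.add(ptr)
        yield ptr, space[ptr]
        ptr = space[ptr + 1] & 0xFC


# Rows 0x30, 0x50 and 0x90 of the 00:1a.0 dump from comment 34.
rows = [
    "30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 00 00",
    "50: 01 58 c2 c9 00 00 00 00 0a 98 a0 20 00 00 00 00",
    "90: 00 00 00 00 00 00 00 00 13 00 06 03 00 00 00 00",
]

caps = list(walk_capabilities(parse_rows(rows)))
for offset, cap_id in caps:
    print(f"capability {cap_id:#04x} at offset {offset:#04x}")
```

The IDs found this way (0x01, 0x0a, 0x13) correspond to Power Management, Debug Port and Advanced Features in the PCI capability ID list.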
Comment 36 WZab 2013-08-22 22:49:12 UTC
I've retested all the workarounds given in this thread, and I have found that the "acpi_backlight=vendor" parameter suggested by Len Brown in https://bugzilla.kernel.org/show_bug.cgi?id=42844#c4 , when used with kernels 3.10.2 and 3.10.7, not only removes the annoying message, but also allows my Dell Vostro 3750 to boot correctly with different USB devices (both USB 2.0 and 3.0) connected to the USB 3.0 port.
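
To confirm that the workaround is really active on a given boot, the kernel command line can be inspected. The following Python sketch uses a hypothetical helper and a made-up sample command line; on a live system the string would come from /proc/cmdline instead.

```python
def kernel_param(cmdline, name):
    """Return the value of `name` on a kernel command line.

    Returns '' for a bare flag without '=', and None if the parameter is absent.
    If the parameter is given more than once, the last occurrence is returned.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name:
            value = val if sep else ""
    return value


# Hypothetical command line, for illustration only.
sample = "BOOT_IMAGE=/vmlinuz-3.10.7 root=/dev/sda1 ro acpi_backlight=vendor"

if kernel_param(sample, "acpi_backlight") == "vendor":
    print("acpi_backlight=vendor is set")
else:
    print("workaround not active")
```

On the affected machine the same check would be kernel_param(open("/proc/cmdline").read(), "acpi_backlight").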

So I think that this bug may be closed. However, the fact that the presence of a USB device in the USB 3.0 port during power-up (even if it is removed immediately after the BIOS asks for the password) affects the booting of Linux and causes a system freeze is very suspicious, and may be a symptom of some more serious problem :-(.

I'm sorry for keeping this bug open for such a long time (this workaround was probably effective even in earlier kernels).

Regards,
Wojtek
