Created attachment 304742: dmesg outputs, various scenarios
Device: Intel NUC (NUC12WSHi5)
Thunderbolt device: https://www.blackmagicdesign.com/products/ultrastudio/techspecs/W-DLUS-13
Tested kernel: 6.4.7
An issue appears on a 12th-gen Intel CPU when a Thunderbolt device is connected: "timeout enabling lane bonding". The full dmesg is in the attachment as the dmesg_100ms file. There is also a dmesg_100ms_dyndbg file, which is the dmesg output with thunderbolt.dyndbg='+p' enabled.
Interesting points:
- I modified the kernel code (drivers/thunderbolt/switch.c, line 2790) to use a 10000 ms timeout instead of 100 ms, but the issue persists. However, dmesg without dyndbg prints more information in that case (dmesg_10000ms file in the attachment zip).
- The same device works well with a 10th-gen CPU (NUC10i7FNHN) with the same kernel build and driver version. It also works well on numerous 11th-gen NUCs and laptops with older kernels (I never had this issue before).
I am open to assistance, and it is also possible to borrow a Blackmagic UltraStudio from us.
Created attachment 304743: dmesg output, kernel 6.5-rc3
The dmesg output with kernel 6.5-rc3 is again slightly different, so I thought it might also be useful.
Can you check rx_lanes and rx_speed (and likewise tx_lanes and tx_speed) under /sys/bus/thunderbolt/devices/0-1/ on the working system?
Sure:
rx_lanes: 1
rx_speed: 20.0 Gb/s
tx_lanes: 1
tx_speed: 20.0 Gb/s
Okay, that makes sense: the connection manager firmware does not bond the lanes (rx_lanes and tx_lanes are both 1). I'll talk to our firmware people and ask whether this is some sort of workaround.
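For anyone who wants to repeat this check on another machine, here is a minimal sketch that reads the four sysfs attributes mentioned above and reports whether the link looks bonded. The helper names `read_link_attrs` and `is_bonded` are made up for illustration, and treating "more than one lane in both directions" as bonded is a simplifying assumption, not the driver's own definition.

```python
from pathlib import Path

# Attribute names as discussed in this report.
ATTRS = ("rx_lanes", "rx_speed", "tx_lanes", "tx_speed")


def read_link_attrs(device_dir):
    """Return the link attributes found in device_dir as a dict of strings."""
    base = Path(device_dir)
    result = {}
    for name in ATTRS:
        path = base / name
        if path.is_file():
            result[name] = path.read_text().strip()
    return result


def is_bonded(attrs):
    """Assumption: bonded means both directions report more than one lane."""
    return int(attrs.get("rx_lanes", "1")) > 1 and int(attrs.get("tx_lanes", "1")) > 1


if __name__ == "__main__":
    # Device path taken from the comment above; adjust for other topologies.
    attrs = read_link_attrs("/sys/bus/thunderbolt/devices/0-1")
    for name, value in attrs.items():
        print(f"{name}: {value}")
    if attrs:
        print("bonded" if is_bonded(attrs) else "single lane")
```

With the values reported above (rx_lanes and tx_lanes both 1), `is_bonded` returns False.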
Thanks! So I don't know if this is the right place to discuss it in that case, but if this is a firmware issue and it (hopefully) gets fixed, how could Connection Manager FW be updated? BIOS? New firmware blob in linux-firmware? Do we have any hope?
The connection manager firmware is part of the Thunderbolt firmware and can be upgraded through Linux. However, this is most likely not a firmware issue but a Linux issue, so any fix will be added to the kernel driver.
*** Bug 217617 has been marked as a duplicate of this bug. ***
Created attachment 304920: Retry with single lane if lane bonding fails (draft)
Can you try the attached patch? The USB4 spec says that after lane bonding fails the link gets disconnected, and the driver does not currently handle this properly. The expectation is that the first attempt fails with a lane bonding timeout and the second one succeeds with a single-lane link. The patch is not yet finalized, so it may miss some corner cases, but it should at least give us some indication of whether this is the right approach.
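The expected sequence (bonding times out, the link drops, the second attempt comes up with a single lane) can be modeled in a few lines. This is only an illustrative sketch, not the code from the attached patch: `BondingTimeout`, `bring_up_link`, and both callbacks are hypothetical names standing in for driver steps.

```python
class BondingTimeout(Exception):
    """Raised by the hypothetical enable_bonding callback on timeout."""


def bring_up_link(enable_bonding, wait_for_reconnect):
    """Try a bonded (dual-lane) link first; fall back to a single lane.

    Both arguments are hypothetical callbacks standing in for driver steps.
    Returns "dual" or "single".
    """
    try:
        enable_bonding()
        return "dual"
    except BondingTimeout:
        # Assumed behaviour, per the comment above: the failed bonding
        # attempt disconnects the link, so wait for the device to come
        # back and keep the link at single-lane width the second time.
        wait_for_reconnect()
        return "single"
```

On the device in this report, the model would take the `except` branch once and end up with a single-lane link.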
Perfect! After applying the patch on top of 6.4.11 stable, I can confirm that the UltraStudio devices work well on Intel Alder Lake and AMD Zen3+ USB4. After the initial "timeout enabling lane bonding" message, the device reconnects successfully within a few seconds and the drivers are loaded.
Created attachment 304923: Check that lane 1 is in CL0
I realized that lane 1 is probably not connected at all on that device. Can you try the attached simpler patch as well? With this there should be no lane bonding timeouts, as bonding is not even tried.
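The idea behind the simpler patch can be modeled as a pre-check: only attempt bonding when both lanes are up. The sketch below is an illustration under stated assumptions, not the patch itself; the `LaneState` enum is a simplified, hypothetical set of states, and `can_try_bonding` and `choose_link_width` are made-up helper names.

```python
from enum import Enum


class LaneState(Enum):
    # Hypothetical, simplified lane states; only CL0 means "lane is up".
    DISABLED = "disabled"
    TRAINING = "training"
    CL0 = "CL0"


def can_try_bonding(lane0, lane1):
    """Return True only when both lanes are up (CL0)."""
    return lane0 is LaneState.CL0 and lane1 is LaneState.CL0


def choose_link_width(lane0, lane1):
    """Pick "dual" only when bonding is worth attempting, else "single"."""
    if can_try_bonding(lane0, lane1):
        return "dual"
    # At least one lane is not in CL0 (on the device in this report,
    # lane 1): skip bonding entirely, so no bonding timeout can occur.
    return "single"
```

Compared with retrying after a timeout, this check avoids the disconnect and reconnect cycle altogether when the second lane is not connected.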
Not sure whether I should apply this patch on top of the earlier one or try this one only.
Ok so I applied the patch on top of the previous patch even though you marked the previous one as obsolete. But I guess you needed this piece of information anyway: Indeed, dmesg wrote out "lane 1 is not connected" and the device worked straight away without reconnecting! Just for the record, I had to insert the lines manually as the patch is not compatible with stable (TB_LINK_WIDTH_DUAL is not used in the current stable)
Hi, correct, I meant that you should apply only this patch (and drop the previous one). Thanks for checking! I will clean it up, give it some more testing, and send it upstream (probably after the merge window closes).
Thanks! This was very helpful!