Created attachment 292875 [details]
dmesg

Setup:
# uname -a
Linux odroid 5.9.0-rc6-dirty #1 SMP PREEMPT Tue Oct 6 20:04:44 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

dwc2 driver working in gadget mode on Odroid N2 using function fs
kernel: branch next from git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git

Scenario:
The Odroid connects as a custom gadget (with mass storage function, see seconds 114 and 134), then is switched to Android Open Accessory mode and starts transmission using function fs (uses the ep0, ep1, ep2 channels). Suddenly the USB cable is pulled out (roughly second 122). Running the same sequence of actions (second 134) does not work any more until reboot.

In dmesg one can see timeouts. Also, before enabling debug output I could see a "HANG! Soft Reset timeout GRSTCTL_CSFTRST_DONE" line in the dmesg output (drivers/usb/dwc2/core.c:538).

Therefore my conclusion is that there is something wrong with the handling of device disconnect. Due to the number of debug messages produced, the dmesg output might be incomplete.

Please let me know if I can somehow help to fix this bug.
Here is annotated dmesg without debugging info:

[ 31.709214] USB_PWR_EN: disabling
// here I connect the cable and run my program
[ 54.771552] Mass Storage Function, version: 2009/09/11
[ 54.771561] LUN: removable file: (no medium)
[ 54.781721] file system registered
[ 54.782484] read descriptors
[ 54.782495] read strings
[ 54.786939] dwc2 ff400000.usb: bound driver configfs-gadget
[ 54.976434] dwc2 ff400000.usb: new device is high-speed
[ 55.090994] dwc2 ff400000.usb: new device is high-speed
[ 55.142436] dwc2 ff400000.usb: new address 4
[ 56.219325] ffs_data_put(): freeing
[ 56.219828] unloading
// here the switch to AOA mode happens: it is as if the device was disconnected and an entirely new device (different vid, pid) connected
[ 56.257710] file system registered
[ 56.258362] read descriptors
[ 56.258373] read strings
[ 56.262135] dwc2 ff400000.usb: bound driver configfs-gadget
[ 56.454364] dwc2 ff400000.usb: new device is high-speed
[ 56.568367] dwc2 ff400000.usb: new device is high-speed
[ 56.620390] dwc2 ff400000.usb: new address 5
// here the cable is disconnected
[ 79.288505] dwc2 ff400000.usb: dwc2_hsotg_ep_stop_xfr: timeout DIEPINT.NAKEFF
[ 79.288625] dwc2 ff400000.usb: dwc2_hsotg_ep_stop_xfr: timeout DOEPCTL.EPDisable
[ 79.299850] dwc2 ff400000.usb: dwc2_flush_tx_fifo: HANG! AHB Idle GRSCTL
[ 79.299982] dwc2 ff400000.usb: dwc2_hsotg_ep_stop_xfr: timeout GINTSTS.GOUTNAKEFF
[ 79.300105] dwc2 ff400000.usb: dwc2_hsotg_ep_stop_xfr: timeout DOEPCTL.EPDisable
[ 79.307060] ffs_data_put(): freeing
[ 79.307355] unloading
// rmmod dwc2 && modprobe dwc2
[ 119.388282] dwc2 ff400000.usb: supply vusb_d not found, using dummy regulator
[ 119.388349] dwc2 ff400000.usb: supply vusb_a not found, using dummy regulator
[ 119.388500] dwc2 ff400000.usb: Bad value for GSNPSID: 0x00000000
// run my program again, as you see it doesn't reach the "bound driver configfs-gadget" line seen above
[ 255.814129] Mass Storage Function, version: 2009/09/11
[ 255.814138] LUN: removable file: (no medium)
[ 255.814545] file system registered
[ 255.815119] read descriptors
[ 255.815128] read strings
Hi Tomasz,

Could you please disable power optimization with the following workaround and test again:

file: params.c

static void dwc2_set_param_power_down(struct dwc2_hsotg *hsotg)
{
	int val;

	if (hsotg->hw_params.hibernation)
		val = DWC2_POWER_DOWN_PARAM_HIBERNATION;
	else if (hsotg->hw_params.power_optimized)
		val = DWC2_POWER_DOWN_PARAM_PARTIAL;
	else
		val = DWC2_POWER_DOWN_PARAM_NONE;

	hsotg->params.power_down = 0; //val; WA
}
With this change everything works correctly for me: I can run my program multiple times without rebooting my Odroid between cable disconnects. Thanks a lot. Please let me know if you need any help testing a permanent fix for this issue.
We are aware of the issues in the programming of the power saving modes and will very soon commit a patch series to fix them.
Any news about this one?
Hi,

I'm using RK3308 Rock Pi S, and experiencing a similar issue (though not the same) on v5.14.0-rc2. The mentioned workaround doesn't work. The following testing is done with the workaround applied.

When the host is unplugged, the message buffer is bloated with the following message repeatedly:

# [ 23.215674] dwc2 ff400000.usb: dwc2_flush_rx_fifo: HANG! AHB Idle GRSCTL
[ 23.216448] configfs-gadget gadget: 220 Error!
[ 23.231677] dwc2 ff400000.usb: dwc2_flush_rx_fifo: HANG! AHB Idle GRSCTL
[ 23.232382] configfs-gadget gadget: 220 Error!
[ 23.247524] dwc2 ff400000.usb: dwc2_flush_rx_fifo: HANG! AHB Idle GRSCTL
[ 23.263000] dwc2 ff400000.usb: dwc2_flush_rx_fifo: HANG! AHB Idle GRSCTL
[ 23.278459] dwc2 ff400000.usb: dwc2_flush_rx_fifo: HANG! AHB Idle GRSCTL

And if we plug the cable in again, the HANG message stops, followed by these messages:

[ 18.332489] dwc2 ff400000.usb: dwc2_hsotg_ep_stop_xfr: timeout GINTSTS.GOUTNAKEFF
[ 18.333378] dwc2 ff400000.usb: dwc2_hsotg_ep_stop_xfr: timeout DOEPCTL.EPDisable
[ 18.334265] dwc2 ff400000.usb: dwc2_hsotg_ep_stop_xfr: timeout GINTSTS.GOUTNAKEFF

And then the kernel completely freezes, not able to even respond to a key stroke or network ping.
Hi,

On 7/24/2021 8:42 PM, bugzilla-daemon@bugzilla.kernel.org wrote:
> --- Comment #6 from Yunhao Tian (t123yh@outlook.com) ---
> I'm using RK3308 Rock Pi S, and experiencing a similar issue (though not the
> same) on v5.14.0-rc2. The mentioned workaround doesn't work. The following
> testing is done with the workaround applied.

For 5.14-rc2 there is no need to apply the mentioned workaround, because the power issue related to 'rmmod dwc2' has been resolved.

> [...]
> And then the kernel completely freezes, not able to even respond to a key
> stroke or network ping.

Could you please apply patch "[PATCH v2] usb: phy: Fix page fault from usb_phy_uevent" from Artur Petrosyan and test again.

Thanks,
Minas
(In reply to Minas Harutyunyan from comment #7)
> Could you please apply patch "[PATCH v2] usb: phy: Fix page fault from
> usb_phy_uevent" from Artur Petrosyan and test again.
>

Hi Minas, Thanks for your reply!

It doesn't seem to make a difference with the patch applied.

I enabled the dwc2 debug logging option in menuconfig, and captured the logs when I plug in, disconnect and re-plug in.

The link to log file is
https://drive.google.com/file/d/1ID3bDp4NA6vSXf4AqN8w2WiDhaFbnb59/view?usp=sharing

At [ 32.964469] the gadget config was bound to the device;
At [ 38.002792] the device was plugged to a PC;
At roughly [ 43.063762] the device was disconnected; at [ 45.640378] the device was re-plugged.

After a short period of time the kernel freezes, no more logs can be output.
Hi Yunhao,

On 7/25/2021 10:55 AM, bugzilla-daemon@bugzilla.kernel.org wrote:
> --- Comment #8 from Yunhao Tian (t123yh@outlook.com) ---
> The link to log file is
> https://drive.google.com/file/d/1ID3bDp4NA6vSXf4AqN8w2WiDhaFbnb59/view?usp=sharing

drive.google.com not accessible from my corporate laptop. Could you please put debug log on bugzilla.kernel.org?

Thanks,
Minas
On 7/25/2021 11:16 AM, Minas Harutyunyan wrote:
> drive.google.com not accessible from my corporate laptop. Could you
> please put debug log on bugzilla.kernel.org?

Also please send your params dump.
Created attachment 298021 [details] Log file of DWC2 gadget failure
Hi Minas, > drive.google.com not accessible from my corporate laptop. Could you > please put debug log on bugzilla.kernel.org? Sorry I didn't notice the attachment function of bugzilla before. The log file is uploaded now. > Also please send your params dump. Could you please explain what is params dump and how can I get it?
On 7/25/2021 11:34 AM, bugzilla-daemon@bugzilla.kernel.org wrote:
> --- Comment #12 from Yunhao Tian (t123yh@outlook.com) ---
>> Also please send your params dump.
>
> Could you please explain what is params dump and how can I get it?

cat /sys/kernel/debug/usb/dwc2.2.auto/params
cat /sys/kernel/debug/usb/dwc2.2.auto/regdump
Created attachment 298025 [details] Param and reg dump
Hi Minas,

The params and regdump are sent as an attachment. I'm unable to get a regdump after detach, because the system now freezes after disconnecting USB, with the message dwc2_flush_rx_fifo: HANG! AHB Idle GRSCTL flooding the log.
Hi, I added a dump function and managed to get a regdump after stuck at HANG! AHB Idle GRSCTL. It's attached as dump-after-stuck.txt
Created attachment 298027 [details] Regdump after stuck at HANG
Hi Yunhao,

Thank you for the provided info.

The driver is set to DDMA mode by default. In DDMA mode it is very important that the request producer (the function driver) produces more requests than the consumer (the HW) can process; otherwise the BNA (buffer not available) interrupt will be asserted.

In your case the function driver queued only 2 requests for EP2OUT, so the driver created a descriptor list with 2 descriptors. On completion of the 2nd descriptor the core returns to the first one, which is already completed and not yet renewed. Because of this, BNA is asserted. For OUT EPs, the RxFIFO should be flushed on a BNA interrupt, but the AHB is not idle: the core permanently tries to fetch a new descriptor, and no new descriptor has been created yet. This creates an infinite loop.

So I suggest that you let your function driver produce more requests before the first ISOC OUT packets are received in the RxFIFO, so that the pool of descriptors is big enough.

If you can't control the initial request count from the function driver, you can try setting the core to BDMA mode:
g_dma_desc : 1 --> 0
In BDMA mode BNA will not be asserted.

Thanks,
Minas
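The descriptor-list starvation described above can be illustrated with a small stand-alone model. This is only a sketch and not dwc2 code: `bna_count`, `MAX_DESC` and the `refill_delay` parameter are hypothetical names invented for the illustration. It assumes that the hardware consumes one descriptor per tick and that the function driver needs a fixed number of ticks to requeue a completed request. It shows the mechanism only and says nothing about whether this mechanism is the cause of the hang in this report.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DESC 64

/*
 * Toy model of a circular descriptor list. This is NOT dwc2 code; all
 * names here are hypothetical and exist only for this illustration.
 *
 * Per tick:
 *   1. the "function driver" requeues every completed descriptor that
 *      has been waiting for at least refill_delay ticks;
 *   2. the "hardware" looks at the current descriptor. If it is ready,
 *      the hardware completes it and advances; if not, that tick counts
 *      as one buffer-not-available (BNA) event and the hardware stays put.
 *
 * Returns the number of BNA events seen in the given number of ticks.
 */
unsigned int bna_count(size_t ndesc, unsigned int refill_delay,
		       unsigned int ticks)
{
	bool ready[MAX_DESC];
	unsigned int done_at[MAX_DESC];
	unsigned int bna = 0;
	size_t cur = 0;

	if (ndesc == 0)
		return ticks;	/* nothing queued: every tick is a BNA */
	if (ndesc > MAX_DESC)
		ndesc = MAX_DESC;

	for (size_t i = 0; i < ndesc; i++) {
		ready[i] = true;	/* all requests are queued up front */
		done_at[i] = 0;
	}

	for (unsigned int t = 0; t < ticks; t++) {
		/* Step 1: driver requeues descriptors whose delay has elapsed. */
		for (size_t i = 0; i < ndesc; i++) {
			if (!ready[i] && t - done_at[i] >= refill_delay)
				ready[i] = true;
		}

		/* Step 2: hardware consumes the current descriptor, if ready. */
		if (ready[cur]) {
			ready[cur] = false;
			done_at[cur] = t;
			cur = (cur + 1) % ndesc;
		} else {
			bna++;
		}
	}

	return bna;
}
```

In this model a list of 2 descriptors with a requeue delay of 4 ticks runs into BNA events as soon as the hardware wraps around, while a list of 16 descriptors with the same delay never does, which is the intuition behind queuing more requests.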
Hi Minas, I tried to set g_dma_desc to 0, but unfortunately things are still not working. The logs are attached, in case you are interested. As you've said, improving function driver (in this case, uac2) is the way to go. I'll have a try. Thank you very much!
Created attachment 298031 [details] Log after setting g_dma_desc to 0
Hi Yunhao,

For uac2, change
#define UAC2_DEF_REQ_NUM 2
to a larger value, e.g. 16.

Thanks,
Minas
Hi Minas,

I changed UAC2_DEF_REQ_NUM to 8, still not working. The log is attached.

To make sure this value is actually changed, I added two debug prints in u_audio.c. You can see the two printk's in the attached log.

for (i = 0; i < params->req_number; i++) {
	printk("REQ NUM (c) %d\n", i); // <=== Added
	if (!prm->reqs[i]) {
		printk("ALLOC %d\n", i); // <=== Added
		req = usb_ep_alloc_request(ep, GFP_ATOMIC);

The symptom is still the same: I am still getting the BNA interrupt.
Created attachment 298043 [details] Log after changing UAC2_DEF_REQ_NUM to 8
Hi - was this ever resolved? I think I am facing the same issue. When I unplug the cable, I see:

[ 126.362313] dwc2 ff580000.usb: dwc2_hsotg_ep_stop_xfr: timeout GINTSTS.GOUTNAKEFF
[ 126.370999] dwc2 ff580000.usb: dwc2_hsotg_ep_stop_xfr: timeout DOEPCTL.EPDisable
[ 126.379585] dwc2 ff580000.usb: dwc2_hsotg_ep_stop_xfr: timeout GINTSTS.GOUTNAKEFF
[ 126.388225] dwc2 ff580000.usb: dwc2_hsotg_ep_stop_xfr: timeout DOEPCTL.EPDisable

And the machine hangs.

Using with g_serial, dr_mode is "peripheral". On 5.15. I have disabled lpm also.

Thanks!
Hi remyvarma, Are you using a rockchip device? I copied dwc2 driver from https://github.com/rockchip-linux/kernel/tree/develop-4.19 , replaced the mainline dwc2, did some porting work, and everything starts working. I'm not very sure what's the difference, but it just works.
I am... rk3288. I was thinking about doing the same thing (although it seems a little hacky). I wonder if you wouldn't mind sharing your port for the time being?
You may refer to https://github.com/t123yh/linux/tree/for-rk3308 for a working version of the code. Meanwhile I hope that we might be able to request some help from @Minas?
Thank you. Yes it would be great to find the root cause of this. I am happy to help debug.

Two things I noticed. Is it normal to have two messages like this:

[ 54.976434] dwc2 ff400000.usb: new device is high-speed
[ 55.090994] dwc2 ff400000.usb: new device is high-speed

(I also see the same thing on mine)

And also, I think possibly related to this: https://lore.kernel.org/lkml/20200214160149.11681-67-sashal@kernel.org/

I saw

DPTXFIFO 9: Size 256, Start 0x00000123

in my debug logs at some point, even with that patch applied. I will reenable debug logging to try to find it
I see this:

[ 8.469658] dwc2 ff580000.usb: Core Release: 3.10a (snpsid=4f54310a)
[ 8.482710] dwc2 ff580000.usb: dwc2_core_reset()
[ 8.488584] dwc2 ff580000.usb: Forcing mode to device
[ 8.502210] dwc2 ff580000.usb: Waiting for device mode
[ 8.507962] dwc2 ff580000.usb: Device mode set
[ 8.512932] dwc2 ff580000.usb: Forcing mode to device
[ 8.518577] dwc2 ff580000.usb: Waiting for device mode
[ 8.518584] dwc2 ff580000.usb: Device mode set
[ 8.518611] dwc2 ff580000.usb: NonPeriodic TXFIFO size: 16
[ 8.518615] dwc2 ff580000.usb: RXFIFO size: 275
[ 8.546968] dwc2 ff580000.usb: EPs: 10, dedicated fifos, 972 entries in SPRAM
[ 8.557497] dwc2 ff580000.usb: DCFG=0x08100000, DCTL=0x00000002, DIEPMSK=00000000
[ 8.569534] dwc2 ff580000.usb: GAHBCFG=0x00000000, GHWCFG1=0x00006664
[ 8.576754] dwc2 ff580000.usb: GRXFSIZ=0x00000400, GNPTXFSIZ=0x00100400
[ 8.584152] dwc2 ff580000.usb: DPTx[1] FSize=256, StAddr=0x00000410
[ 8.589476] systemd[1]: Started Journal Service.
[ 8.591155] dwc2 ff580000.usb: DPTx[2] FSize=256, StAddr=0x00000900
[ 8.603313] dwc2 ff580000.usb: DPTx[3] FSize=256, StAddr=0x00000a00
[ 8.610314] dwc2 ff580000.usb: DPTx[4] FSize=256, StAddr=0x00000b00
[ 8.617312] dwc2 ff580000.usb: DPTx[5] FSize=256, StAddr=0x00000c00
[ 8.624311] dwc2 ff580000.usb: DPTx[6] FSize=256, StAddr=0x00000d00
[ 8.636845] dwc2 ff580000.usb: DPTx[7] FSize=0, StAddr=0x00000e00
[ 8.643840] dwc2 ff580000.usb: DPTx[8] FSize=0, StAddr=0x00000f00
[ 8.650652] dwc2 ff580000.usb: DPTx[9] FSize=256, StAddr=0x00000410
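One way to read the FIFO dump above is to check whether any two dedicated TX FIFOs claim the same address range. The following is a minimal sketch, assuming that FSize and StAddr are expressed in the same units and that a FIFO occupies the half-open range [StAddr, StAddr + FSize). The type and function names are made up for the illustration and are not part of the dwc2 driver; the values are copied from the DPTx lines of the log.

```c
#include <stddef.h>
#include <stdio.h>

/* One dedicated TX FIFO as printed in the log (hypothetical helper type). */
struct fifo_range {
	int index;		/* N in "DPTx[N]" */
	unsigned int start;	/* StAddr from the log */
	unsigned int size;	/* FSize from the log */
};

/* Values copied from the DPTx[1]..DPTx[9] lines in the log above. */
const struct fifo_range dptx_from_log[] = {
	{ 1, 0x410, 256 },
	{ 2, 0x900, 256 },
	{ 3, 0xa00, 256 },
	{ 4, 0xb00, 256 },
	{ 5, 0xc00, 256 },
	{ 6, 0xd00, 256 },
	{ 7, 0xe00, 0 },
	{ 8, 0xf00, 0 },
	{ 9, 0x410, 256 },
};
const size_t dptx_from_log_len =
	sizeof(dptx_from_log) / sizeof(dptx_from_log[0]);

/* Two half-open ranges overlap if each one starts before the other ends. */
int fifo_ranges_overlap(const struct fifo_range *a, const struct fifo_range *b)
{
	if (a->size == 0 || b->size == 0)
		return 0;	/* an empty FIFO cannot collide with anything */
	return a->start < b->start + b->size &&
	       b->start < a->start + a->size;
}

/* Prints every overlapping pair and returns how many pairs were found. */
size_t report_fifo_overlaps(const struct fifo_range *f, size_t n)
{
	size_t found = 0;

	for (size_t i = 0; i < n; i++) {
		for (size_t j = i + 1; j < n; j++) {
			if (fifo_ranges_overlap(&f[i], &f[j])) {
				printf("DPTx[%d] and DPTx[%d] overlap\n",
				       f[i].index, f[j].index);
				found++;
			}
		}
	}
	return found;
}
```

With the values from the log, the only pair this check reports is DPTx[1] and DPTx[9], which both start at 0x410. Whether that is a real misconfiguration or just register content printed before the driver reprograms the FIFOs cannot be decided from this log alone.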
Hey just as another data point, I believe the odroid has a type c connector. My device also has a type c connector.
Hi, Just discovered the real issue of my case. dwc2_gadget_enter_clock_gating doesn't seem to be working for my RK3308, so disabling clock_gating does the trick for me. Specifically, I added p->no_clock_gating = true; in dwc2_set_rk_params, and everything now works.
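For readers who want to see the shape of that change without the driver sources at hand, here is a stand-alone sketch. It is not the kernel code: the `mock_` types and functions are stand-ins invented for the illustration, only the field names `no_clock_gating` and `power_down` are taken from this thread, and the suspend path is an assumed, simplified model in which the clock is gated only when no deeper power-down mode is selected and gating has not been disabled.

```c
#include <stdbool.h>

/* Stand-ins for the driver's types; only the fields used here are modelled. */
enum mock_power_down {
	MOCK_POWER_DOWN_NONE = 0,
	MOCK_POWER_DOWN_PARTIAL,
	MOCK_POWER_DOWN_HIBERNATION,
};

struct mock_core_params {
	enum mock_power_down power_down;
	bool no_clock_gating;
};

struct mock_hsotg {
	struct mock_core_params params;
	bool clock_gated;	/* models the state after a suspend event */
};

/* Mirrors the one-line change from the comment above. */
void mock_set_rk_params(struct mock_hsotg *hsotg)
{
	struct mock_core_params *p = &hsotg->params;

	p->no_clock_gating = true;
}

/*
 * Assumed, simplified suspend path: gate the clock only if no deeper
 * power-down mode is selected and clock gating has not been disabled.
 */
void mock_handle_suspend(struct mock_hsotg *hsotg)
{
	if (hsotg->params.power_down == MOCK_POWER_DOWN_NONE &&
	    !hsotg->params.no_clock_gating)
		hsotg->clock_gated = true;
}
```

In this model a zero-initialised device gets its clock gated on suspend, while one that went through `mock_set_rk_params()` first does not, which is the behaviour the parameter is meant to select.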
Hello, > hsotg->params.power_down = 0; //val; WA This workaround works for me on Odroid C4, linux 5.15.32.
Hi,

Just confirming I had the same problem as Yunhao Tian, running a fairly vanilla 5.10 kernel on RK3308. It was solved for me as above, by patching

p->no_clock_gating = true;

into params.c.

This should definitely be considered for upstreaming as an RK3308 platform-specific param for DWC2.