Bug 217845

Summary: WWAN driver mtk_t7xx prevent suspend/sleep (possibly shutdown) completely and can't connect again once disconnected
Product: Drivers
Reporter: Hendri (myperfectmaze)
Component: network-wireless
Assignee: drivers_network-wireless (drivers_network-wireless)
Status: NEW
Severity: high
CC: bagasdotme, danny, kai.heng.feng, manfred.kitzbichler, tinoucas
Priority: P3
Hardware: All
OS: Linux
Kernel Version: 6.5.0
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg output

Description Hendri 2023-08-30 13:02:36 UTC
Created attachment 304985
dmesg output

Hi, I found this issue when I enabled the mtk_t7xx kernel module. The driver prevents the system from entering sleep/suspend, and on top of that, once the broadband connection is disconnected it does not come back up.

If I run rmmod mtk_t7xx and then try to reboot or shut down, the system gets stuck on the reboot/shutdown screen until I force it off by holding the power button.
Also, on the KDE desktop, the KDE taskbar can freeze right after the failed sleep.

Device: Asus Expertbook B7 Flip
Linux kernel version: 6.5.0 (Compiled manually)


dmesg
[   84.326426] mtk_t7xx 0000:57:00.0: [PM] SAP suspend error: -110
[   84.326469] mtk_t7xx 0000:57:00.0: PM: pci_pm_suspend(): t7xx_pci_pm_suspend+0x0/0x20 [mtk_t7xx] returns -110
[   84.326480] mtk_t7xx 0000:57:00.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x170 returns -110
[   84.326487] mtk_t7xx 0000:57:00.0: PM: failed to suspend async: error -110
[   84.326691] PM: Some devices failed to suspend, or early wake event detected
[   84.330377] intel-hid INTC1070:00: failed to get button capability
[   84.847436] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_ops [i915])
[   84.848165] mei_pxp 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:00:02.0 (ops i915_pxp_tee_component_ops [i915])
[   84.848621] OOM killer enabled.
[   84.848624] Restarting tasks ... done.
[   84.850826] random: crng reseeded on system resumption
[   84.938572] PM: suspend exit
[   85.043109] e1000e 0000:00:1f.6 eth0: NIC Link is Down
[   86.479012] mtk_t7xx 0000:57:00.0: Failed to send skb: -22
[   86.479017] mtk_t7xx 0000:57:00.0: Write error on AT port, -22
[   86.479139] mtk_t7xx 0000:57:00.0: Failed to send skb: -22
[   86.479143] mtk_t7xx 0000:57:00.0: Failed to send skb: -22
[   86.479143] mtk_t7xx 0000:57:00.0: Write error on MBIM port, -22
[   86.479145] mtk_t7xx 0000:57:00.0: Write error on AT port, -22
[   86.479242] mtk_t7xx 0000:57:00.0: Failed to send skb: -22
[   86.479243] mtk_t7xx 0000:57:00.0: Write error on AT port, -22
[   86.479249] mtk_t7xx 0000:57:00.0: Failed to send skb: -22
[   86.479250] mtk_t7xx 0000:57:00.0: Write error on AT port, -22
[   86.479334] mtk_t7xx 0000:57:00.0: Failed to send skb: -22
[   86.479334] mtk_t7xx 0000:57:00.0: Write error on AT port, -22
[   86.479342] mtk_t7xx 0000:57:00.0: Failed to send skb: -22
[   86.479345] mtk_t7xx 0000:57:00.0: Write error on MBIM port, -22
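
A note for readers of the log above: the trailing negative numbers are kernel errno values. On Linux, -110 is ETIMEDOUT and -22 is EINVAL, which suggests that the suspend request to the modem timed out and that later writes to the AT and MBIM ports were rejected. The Python sketch below decodes such lines. The table `LINUX_ERRNO` and the helper `decode_errno` are illustrative names chosen here, not part of any tool mentioned in this report, and the table covers only the two codes seen in this log.

```python
import re

# Linux errno numbers (asm-generic values); only the two codes seen in this log.
LINUX_ERRNO = {
    22: ("EINVAL", "Invalid argument"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Matches a negative number at the very end of a log line, e.g. "error: -110".
CODE_RE = re.compile(r"-(\d+)\s*$")


def decode_errno(line):
    """Return (name, description) for a trailing -N code in a dmesg line, or None."""
    match = CODE_RE.search(line)
    if match is None:
        return None
    return LINUX_ERRNO.get(int(match.group(1)), ("UNKNOWN", "not in table"))


sample = [
    "[   84.326426] mtk_t7xx 0000:57:00.0: [PM] SAP suspend error: -110",
    "[   86.479012] mtk_t7xx 0000:57:00.0: Failed to send skb: -22",
    "[   86.479017] mtk_t7xx 0000:57:00.0: Write error on AT port, -22",
]
for line in sample:
    print(decode_errno(line), "<-", line)
```

Lines without a trailing negative number (for example "OOM killer enabled.") yield `None`, so the helper can be run over a whole dmesg dump to pick out only the failing calls.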


lspci output
57:00.0 Wireless controller [0d40]: MEDIATEK Corp. Device 4d75 (rev 01)
        Subsystem: Device 1cf8:3500
        Flags: bus master, fast devsel, latency 0, IRQ 17, IOMMU group 18
        Memory at 603d000000 (64-bit, prefetchable) [size=32K]
        Memory at 84800000 (64-bit, non-prefetchable) [size=8M]
        Memory at 603c800000 (64-bit, prefetchable) [size=8M]
        Capabilities: [80] Express Endpoint, MSI 00
        Capabilities: [d0] MSI-X: Enable+ Count=34 Masked-
        Capabilities: [e0] MSI: Enable- Count=1/32 Maskable+ 64bit+
        Capabilities: [f8] Power Management version 3
        Capabilities: [100] Vendor Specific Information: ID=1556 Rev=1 Len=008 <?>
        Capabilities: [108] Latency Tolerance Reporting
        Capabilities: [110] L1 PM Substates
        Capabilities: [200] Advanced Error Reporting
        Capabilities: [300] Secondary PCI Express
        Kernel driver in use: mtk_t7xx
        Kernel modules: mtk_t7xx


mmcli -m 0 output when connected to broadband connection
  -----------------------------------
  General   |                   path: /org/freedesktop/ModemManager1/Modem/0
            |              device id: 274c67ce822dbf30d286b2a796e61c2ffe0eda82
  -----------------------------------
  Hardware  |           manufacturer: generic
            |                  model: MBIM [14C3:4D75]
            |      firmware revision: 81600.0000.00.29.22.05_GC
            |                         E05
            |           h/w revision: V1.0.6
            |              supported: gsm-umts, lte, 5gnr
            |                current: gsm-umts, lte, 5gnr
            |           equipment id: 862146050300084
  -----------------------------------
  System    |                 device: /sys/devices/pci0000:00/0000:00:1c.5/0000:57:00.0
            |                drivers: mtk_t7xx
            |                 plugin: generic
            |           primary port: wwan0mbim0
            |                  ports: wwan0 (net), wwan0at0 (at), wwan0mbim0 (mbim)
  -----------------------------------
  Status    |                   lock: sim-pin2
            |         unlock retries: sim-pin2 (3)
            |                  state: connected
            |            power state: on
            |            access tech: lte
            |         signal quality: 45% (recent)
  -----------------------------------
  Modes     |              supported: allowed: 3g; preferred: none
            |                         allowed: 4g; preferred: none
            |                         allowed: 3g, 4g; preferred: none
            |                         allowed: 5g; preferred: none
            |                         allowed: 3g, 5g; preferred: none
            |                         allowed: 4g, 5g; preferred: none
            |                         allowed: 3g, 4g, 5g; preferred: none
            |                current: allowed: 3g, 4g, 5g; preferred: none
  -----------------------------------
  IP        |              supported: ipv4, ipv6, ipv4v6
  -----------------------------------
  3GPP      |                   imei: 862146050300084
            |          enabled locks: fixed-dialing
            |            operator id: 51010
            |          operator name: xxxx
            |           registration: home
            |   packet service state: attached
  -----------------------------------
  3GPP EPS  |   ue mode of operation: csps-2
            |    initial bearer path: /org/freedesktop/ModemManager1/Bearer/0
            | initial bearer ip type: ipv4v6
  -----------------------------------
  3GPP 5GNR |              mico mode: disabled
  -----------------------------------
  SIM       |       primary sim path: /org/freedesktop/ModemManager1/SIM/0
            |         sim slot paths: slot 1: /org/freedesktop/ModemManager1/SIM/0 (active)
            |                         slot 2: /org/freedesktop/ModemManager1/SIM/1
  -----------------------------------
  Bearer    |                  paths: /org/freedesktop/ModemManager1/Bearer/1


### This is the output of dmesg right after I do systemctl suspend
[  682.953377] mtk_t7xx 0000:57:00.0: [PM] SAP suspend error: -110
[  682.953415] mtk_t7xx 0000:57:00.0: PM: pci_pm_suspend(): t7xx_pci_pm_suspend+0x0/0x20 [mtk_t7xx] returns -110
[  682.953426] mtk_t7xx 0000:57:00.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x170 returns -110
[  682.953432] mtk_t7xx 0000:57:00.0: PM: failed to suspend async: error -110
[  682.953651] PM: Some devices failed to suspend, or early wake event detected
[  682.957050] intel-hid INTC1070:00: failed to get button capability
[  682.960765] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/adlp_guc_70.bin version 70.5.1
[  682.960767] i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
[  682.971159] nvme nvme0: Shutdown timeout set to 10 seconds
[  682.974699] nvme nvme0: 16/0/0 default/read/poll queues
[  682.975219] i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads
[  682.976331] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[  682.976332] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[  682.976903] i915 0000:00:02.0: [drm] GT0: GUC: RC enabled
[  683.126105] iwlwifi 0000:00:14.3: WFPM_UMAC_PD_NOTIFICATION: 0x1f
[  683.126116] iwlwifi 0000:00:14.3: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
[  683.126124] iwlwifi 0000:00:14.3: WFPM_AUTH_KEY_0: 0x90
[  683.126133] iwlwifi 0000:00:14.3: CNVI_SCU_SEQ_DATA_DW9: 0x0
[  683.468768] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_ops [i915])
[  683.469386] OOM killer enabled.
[  683.469387] Restarting tasks ... 
[  683.469515] mei_pxp 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:00:02.0 (ops i915_pxp_tee_component_ops [i915])
[  683.470749] done.
[  683.470761] random: crng reseeded on system resumption
[  683.555936] PM: suspend exit
[  683.655181] e1000e 0000:00:1f.6 eth0: NIC Link is Down
[  684.049608] iwlwifi 0000:00:14.3: WFPM_UMAC_PD_NOTIFICATION: 0x1f
[  684.049658] iwlwifi 0000:00:14.3: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
[  684.049666] iwlwifi 0000:00:14.3: WFPM_AUTH_KEY_0: 0x90
[  684.049674] iwlwifi 0000:00:14.3: CNVI_SCU_SEQ_DATA_DW9: 0x0
[  685.087717] mtk_t7xx 0000:57:00.0: CLDMA0 queue 2 is not empty
[  690.096629] mtk_t7xx 0000:57:00.0: CLDMA0 queue 2 is not empty

Let me know if more information is required.
Comment 1 Bagas Sanjaya 2023-08-30 23:57:55 UTC
(In reply to Hendri from comment #0)

Does this issue also occur on older kernels (v6.4, v6.1)?
Comment 2 Hendri 2023-08-31 00:07:34 UTC
(In reply to Bagas Sanjaya from comment #1)

> Does this issue also occur on older kernels (v6.4, v6.1)?

Yes, this also occurs on kernels v6.4.0 and v6.2.0, and on Ubuntu's v6.2.0-26.

I'm not sure about kernel v6.1.
Comment 3 Daniel Suchy 2023-11-12 15:41:39 UTC
I see similar behavior even with the recent kernel 6.6.1 using an FM350GL modem, with the following backtrace during mtk_t7xx unload. It doesn't matter whether the modem is locked or unlocked (an FCC unlock procedure is needed to have the modem fully working).

I see a similar backtrace even if I force removal of the device ( echo 1 > /sys/bus/pci/devices/0000:00:1c.0/0000:08:00.0/remove ) while mtk_t7xx is loaded.



<6>[  769.832876][ T7527] mtk_t7xx 0000:08:00.0: enabling device (0000 -> 0002)
<3>[  769.833886][ T7527] (unnamed net_device) (dummy): netif_napi_add_weight() called with weight 128
<3>[  769.840674][  T143] mtk_t7xx 0000:08:00.0: Port AT is not opened, drop packets
<6>[  769.856013][ T7531] wwan wwan0: port wwan0at0 attached
<6>[  769.856034][ T7531] wwan wwan0: port wwan0mbim0 attached

<6>[  819.761804][ T7482] wwan wwan0: port wwan0at0 disconnected
<6>[  819.762222][ T7482] wwan wwan0: port wwan0mbim0 disconnected
<3>[  820.767001][ T7482] mtk_t7xx 0000:08:00.0: Could not stop CLDMA1 queues
<1>[  821.280691][ T7531] BUG: unable to handle page fault for address: 00000000009c9c58
<1>[  821.280715][ T7531] #PF: supervisor write access in kernel mode
<1>[  821.280723][ T7531] #PF: error_code(0x0002) - not-present page
<6>[  821.280731][ T7531] PGD 0 P4D 0 
<4>[  821.280741][ T7531] Oops: 0002 [#1] PREEMPT SMP NOPTI
<4>[  821.280752][ T7531] CPU: 5 PID: 7531 Comm: t7xx_fsm Not tainted 6.6.1 #1 0a4fd746692a3d23bd8a9388d4826e91599804e0
<4>[  821.280765][ T7531] Hardware name: LENOVO 21BN002QCK/21BN002QCK, BIOS N3CET58W (1.39 ) 09/04/2023
<4>[  821.280771][ T7531] RIP: 0010:queued_spin_lock_slowpath+0x182/0x1c0
<4>[  821.280798][ T7531] Code: f3 90 48 8b 39 48 85 ff 74 f6 eb d9 c1 ef 12 83 e0 03 83 ef 01 48 c1 e0 04 48 63 ff 48 05 40 5b 02 00 48 03 04 fd e0 38 db bd <48> 89 08 8b 41 08 85 c0 75 09 f3 90 8b 41 08 85 c0 74 f7 48 8b 39
<4>[  821.280806][ T7531] RSP: 0018:ffffc9000c1d3e40 EFLAGS: 00010002
<4>[  821.280815][ T7531] RAX: 00000000009c9c58 RBX: 0000000000000202 RCX: ffff88844f565b40
<4>[  821.280823][ T7531] RDX: ffffc9000c36bc00 RSI: 0000000000180000 RDI: 0000000000002f25
<4>[  821.280829][ T7531] RBP: ffffc9000c36bbf8 R08: 0000000000180000 R09: 0000000000000000
<4>[  821.280834][ T7531] R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000c36bc08
<4>[  821.280839][ T7531] R13: dead000000000122 R14: ffffc9000c36bc00 R15: ffff88810798ff40
<4>[  821.280844][ T7531] FS:  0000000000000000(0000) GS:ffff88844f540000(0000) knlGS:0000000000000000
<4>[  821.280852][ T7531] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  821.280858][ T7531] CR2: 00000000009c9c58 CR3: 00000001a141c000 CR4: 0000000000750ee0
<4>[  821.280864][ T7531] PKRU: 55555554
<4>[  821.280869][ T7531] Call Trace:
<4>[  821.280876][ T7531]  <TASK>
<4>[  821.280886][ T7531]  ? __die+0x1e/0x70
<4>[  821.280904][ T7531]  ? page_fault_oops+0x14c/0x480
<4>[  821.280919][ T7531]  ? free_unref_page+0xe7/0x190
<4>[  821.280931][ T7531]  ? __free_one_page+0x5b/0x440
<4>[  821.280940][ T7531]  ? exc_page_fault+0x60/0x90
<4>[  821.280951][ T7531]  ? asm_exc_page_fault+0x26/0x30
<4>[  821.280968][ T7531]  ? queued_spin_lock_slowpath+0x182/0x1c0
<4>[  821.280982][ T7531]  _raw_spin_lock_irqsave+0x32/0x50
<4>[  821.280996][ T7531]  complete_all+0x1f/0x80
<4>[  821.281020][ T7531]  fsm_main_thread+0x404/0x790 [mtk_t7xx 6341429adac352b2929940a4d019abdfa3f1d112]
<4>[  821.281067][ T7531]  ? __pfx_autoremove_wake_function+0x10/0x10
<4>[  821.281080][ T7531]  ? __pfx_fsm_main_thread+0x10/0x10 [mtk_t7xx 6341429adac352b2929940a4d019abdfa3f1d112]
<4>[  821.281113][ T7531]  kthread+0xe3/0x110
<4>[  821.281126][ T7531]  ? __pfx_kthread+0x10/0x10
<4>[  821.281137][ T7531]  ret_from_fork+0x2f/0x50
<4>[  821.281152][ T7531]  ? __pfx_kthread+0x10/0x10
<4>[  821.281162][ T7531]  ret_from_fork_asm+0x1b/0x30
<4>[  821.281177][ T7531]  </TASK>
<4>[  821.281181][ T7531] Modules linked in: mtk_t7xx wwan bnep cmac ccm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_hda_codec_hdmi snd_ctl_led snd_hda_codec_realtek snd_hda_codec_generic btusb btrtl btintel btbcm btmtk uvcvideo bluetooth uvc videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev videobuf2_common mc ecdh_generic i2c_designware_platform xt_LOG nf_log_syslog ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_ipv6header ip6t_rt xt_tcpmss i2c_hid_acpi xt_recent ipt_REJECT nf_reject_ipv4 xt_set xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp nft_compat i915 nf_tables libcrc32c iwlmvm ip_set_hash_net ip_set binfmt_misc nfnetlink mac80211 intel_uncore_frequency intel_uncore_frequency_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp nls_cp852 libarc4 ptp coretemp vfat mei_pxp mei_hdcp pps_core drm_buddy fat ttm snd_hda_intel kvm_intel intel_rapl_msr drm_display_helper snd_intel_dspcfg kvm snd_hda_codec cec processor_thermal_device_pci iwlwifi snd_hda_core processor_thermal_device irqbypass
<4>[  821.281343][ T7531]  drm_kms_helper thinkpad_acpi processor_thermal_rfim snd_pcsp snd_hwdep rapl snd_pcm processor_thermal_mbox think_lmi cfg80211 nvram intel_cstate firmware_attributes_class wmi_bmof ucsi_acpi spi_nor ledtrig_audio iTCO_wdt intel_gtt processor_thermal_rapl snd_timer typec_ucsi mei_me intel_pmc_bxt platform_profile intel_uncore snd roles mtd tiny_power_button video mei iTCO_vendor_support soundcore rfkill typec igen6_edac intel_rapl_common i2c_algo_bit button intel_pmc_core int3403_thermal int3400_thermal intel_hid acpi_tad battery ac wmi evdev int340x_thermal_zone acpi_thermal_rel acpi_pad hid_multitouch serio_raw sch_fq msr fuse loop ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 dm_crypt trusted asn1_encoder dm_mod efivarfs hid_generic crct10dif_pclmul ccp crc32_pclmul i2c_designware_core crc32c_intel polyval_clmulni nvme polyval_generic ghash_clmulni_intel nvme_core nvme_common aesni_intel xhci_pci i2c_hid xhci_pci_renesas t10_pi crypto_simd intel_lpss_pci hid xhci_hcd
<4>[  821.281494][ T7531]  crc64_rocksoft_generic intel_lpss drm usbcore thunderbolt i2c_i801 cryptd idma64 crc64_rocksoft psmouse spi_intel_pci i2c_smbus spi_intel virt_dma crc64 thermal usb_common fan agpgart pinctrl_tigerlake [last unloaded: i2c_designware_platform]
<4>[  821.281536][ T7531] CR2: 00000000009c9c58
<4>[  821.281544][ T7531] ---[ end trace 0000000000000000 ]---
Comment 4 Daniel Suchy 2023-11-12 17:39:04 UTC
While playing with the modem I also noticed that unloading the mtk_t7xx module ends in a kernel panic. Maybe this points more precisely to the problematic code (something around the t7xx_cldma_stop function in the driver).

<6>[ 1655.612020][ T7416] wwan wwan0: port wwan0at0 disconnected
<6>[ 1655.612715][ T7416] wwan wwan0: port wwan0mbim0 disconnected
<3>[ 1656.622405][ T7416] mtk_t7xx 0000:08:00.0: Could not stop CLDMA1 queues
<0>[ 1657.165270][ T7416] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: usleep_range_state+0x8d/0x90
<4>[ 1657.165283][ T7416] CPU: 8 PID: 7416 Comm: rmmod Not tainted 6.6.1 #1 0a4fd746692a3d23bd8a9388d4826e91599804e0
<4>[ 1657.165291][ T7416] Hardware name: LENOVO 21BN002QCK/21BN002QCK, BIOS N3CET58W (1.39 ) 09/04/2023
<4>[ 1657.165295][ T7416] Call Trace:
<4>[ 1657.165300][ T7416]  <TASK>
<4>[ 1657.165307][ T7416]  dump_stack_lvl+0x36/0x50
<4>[ 1657.165314][ T7416]  panic+0x15c/0x310
<4>[ 1657.165324][ T7416]  ? usleep_range_state+0x8d/0x90
<4>[ 1657.165331][ T7416]  __stack_chk_fail+0x14/0x20
<4>[ 1657.165338][ T7416]  usleep_range_state+0x8d/0x90
<4>[ 1657.165351][ T7416]  t7xx_cldma_stop+0x110/0x160 [mtk_t7xx 6341429adac352b2929940a4d019abdfa3f1d112]
<4>[ 1657.165378][ T7416]  t7xx_cldma_exit+0xd/0x60 [mtk_t7xx 6341429adac352b2929940a4d019abdfa3f1d112]
<4>[ 1657.165402][ T7416]  t7xx_md_exit+0x66/0x90 [mtk_t7xx 6341429adac352b2929940a4d019abdfa3f1d112]
<4>[ 1657.165423][ T7416]  t7xx_pci_remove+0x1e/0x60 [mtk_t7xx 6341429adac352b2929940a4d019abdfa3f1d112]
<4>[ 1657.165441][ T7416]  pci_device_remove+0x33/0xa0
<4>[ 1657.165450][ T7416]  device_release_driver_internal+0x1a2/0x210
<4>[ 1657.165460][ T7416]  driver_detach+0x43/0x90
<4>[ 1657.165466][ T7416]  bus_remove_driver+0x68/0xf0
<4>[ 1657.165475][ T7416]  pci_unregister_driver+0x3a/0x80
<4>[ 1657.165480][ T7416]  __do_sys_delete_module+0x1b1/0x2f0
<4>[ 1657.165486][ T7416]  ? xa_load+0x8b/0xe0
<4>[ 1657.165494][ T7416]  do_syscall_64+0x60/0xc0
<4>[ 1657.165502][ T7416]  ? mntput_no_expire+0x45/0x250
<4>[ 1657.165510][ T7416]  ? __call_rcu_common+0xd9/0x790
<4>[ 1657.165516][ T7416]  ? syscall_exit_to_user_mode+0x2f/0x50
<4>[ 1657.165521][ T7416]  ? do_syscall_64+0x6c/0xc0
<4>[ 1657.165527][ T7416]  ? exc_page_fault+0x60/0x90
<4>[ 1657.165531][ T7416]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
<4>[ 1657.165539][ T7416] RIP: 0033:0x7f2dd75279c7
<4>[ 1657.165545][ T7416] Code: 73 01 c3 48 8b 0d 51 94 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 21 94 0c 00 f7 d8 64 89 01 48
<4>[ 1657.165550][ T7416] RSP: 002b:00007ffd35de8098 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
<4>[ 1657.165556][ T7416] RAX: ffffffffffffffda RBX: 000055732f158740 RCX: 00007f2dd75279c7
<4>[ 1657.165560][ T7416] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055732f1587a8
<4>[ 1657.165563][ T7416] RBP: 0000000000000000 R08: 1999999999999999 R09: 0000000000000000
<4>[ 1657.165566][ T7416] R10: 00007f2dd759aac0 R11: 0000000000000206 R12: 00007ffd35de82f0
<4>[ 1657.165569][ T7416] R13: 000055732f158740 R14: 000055732f1582a0 R15: 0000000000000000
<4>[ 1657.165573][ T7416]  </TASK>
<0>[ 1658.209673][ T7416] Shutting down cpus with NMI
<0>[ 1658.209679][ T7416] Kernel Offset: disabled
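
To narrow a trace like the one above down to the driver's own code path, it can help to keep only the frames tagged with the module name. The Python sketch below does that for a few lines copied from this panic. The names `FRAME_RE` and `module_frames` are hypothetical helpers written for this illustration, and the pattern assumes the `func+0xOFF/0xSIZE [module build-id]` frame layout seen in these logs.

```python
import re

# A frame looks like "func+0xOFF/0xSIZE [module optional_build_id]". A leading
# "?" marks a frame the unwinder could not confirm, so it is skipped below.
FRAME_RE = re.compile(
    r"(?P<unreliable>\?\s+)?(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)[^\]]*\])?\s*$"
)


def module_frames(lines, module):
    """Return reliable frame names belonging to `module`, in trace order (innermost first)."""
    frames = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m and m.group("module") == module and not m.group("unreliable"):
            frames.append(m.group("func"))
    return frames


trace = [
    "<4>[ 1657.165331][ T7416]  __stack_chk_fail+0x14/0x20",
    "<4>[ 1657.165338][ T7416]  usleep_range_state+0x8d/0x90",
    "<4>[ 1657.165351][ T7416]  t7xx_cldma_stop+0x110/0x160 [mtk_t7xx 6341429adac352b2929940a4d019abdfa3f1d112]",
    "<4>[ 1657.165378][ T7416]  t7xx_cldma_exit+0xd/0x60 [mtk_t7xx 6341429adac352b2929940a4d019abdfa3f1d112]",
    "<4>[ 1657.165402][ T7416]  t7xx_md_exit+0x66/0x90 [mtk_t7xx 6341429adac352b2929940a4d019abdfa3f1d112]",
    "<4>[ 1657.165423][ T7416]  t7xx_pci_remove+0x1e/0x60 [mtk_t7xx 6341429adac352b2929940a4d019abdfa3f1d112]",
    "<4>[ 1657.165441][ T7416]  pci_device_remove+0x33/0xa0",
]
print(module_frames(trace, "mtk_t7xx"))
```

For this excerpt the result is the chain t7xx_cldma_stop, t7xx_cldma_exit, t7xx_md_exit, t7xx_pci_remove, which matches the removal path named in the comment.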
Comment 5 Manfred Kitzbichler 2023-12-08 01:27:48 UTC
Same here on openSUSE Tumbleweed with kernel 6.6.3.

From my journalctl output:



Dec 07 23:56:56 Yoda ModemManager[1707]: <info>  [modem1] port 'wwan0mbim0' no longer controllable, reprobing
Dec 07 23:56:56 Yoda kernel: wwan wwan0: port wwan0at0 disconnected
Dec 07 23:56:56 Yoda kernel: wwan wwan0: port wwan0mbim0 disconnected
Dec 07 23:56:56 Yoda NetworkManager[1735]: <info>  [1701993416.8509] device (wwan0mbim0): state change: unavailable -> unmanaged (reason 'removed', sys-iface-state: 'removed')
Dec 07 23:56:56 Yoda ModemManager[1707]: <info>  [base-manager] port wwan0at0 released by device '/sys/devices/pci0000:00/0000:00:1c.0/0000:08:00.0'
Dec 07 23:56:56 Yoda ModemManager[1707]: <info>  [base-manager] port wwan0mbim0 released by device '/sys/devices/pci0000:00/0000:00:1c.0/0000:08:00.0'
Dec 07 23:56:57 Yoda kernel: mtk_t7xx 0000:08:00.0: Could not stop CLDMA1 queues
Dec 07 23:56:58 Yoda kernel: mtk_t7xx 0000:08:00.0: Could not stop CLDMA0 queues
Dec 07 23:56:58 Yoda ModemManager[1707]: <info>  [base-manager] port wwan0 released by device '/sys/devices/pci0000:00/0000:00:1c.0/0000:08:00.0'
...
Dec 07 23:57:58 Yoda kernel: BUG: workqueue lockup - pool cpus=4 node=0 flags=0x0 nice=0 stuck for 51s!
Dec 07 23:57:58 Yoda kernel: Showing busy workqueues and worker pools:
Dec 07 23:57:58 Yoda kernel: workqueue events: flags=0x0
Dec 07 23:57:58 Yoda kernel:   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
Dec 07 23:57:58 Yoda kernel:     pending: drm_fb_helper_damage_work
Dec 07 23:57:58 Yoda kernel: workqueue mm_percpu_wq: flags=0x8
Dec 07 23:57:58 Yoda kernel:   pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 refcnt=3
Dec 07 23:57:58 Yoda kernel:     pending: lru_add_drain_per_cpu BAR(132)
Dec 07 23:57:58 Yoda kernel: Showing backtraces of running workers in stalled CPU-bound worker pools:
Dec 07 23:58:06 Yoda kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
Dec 07 23:58:06 Yoda kernel: rcu:         4-...0: (259 ticks this GP) idle=074c/1/0x4000000000000000 softirq=98474/98475 fqs=3562
Dec 07 23:58:06 Yoda kernel: rcu:                  hardirqs   softirqs   csw/system
Dec 07 23:58:06 Yoda kernel: rcu:          number:        0          0            0
Dec 07 23:58:06 Yoda kernel: rcu:         cputime:        0          0            0   ==> 30004(ms)
Dec 07 23:58:06 Yoda kernel: rcu:         (detected by 15, t=18004 jiffies, g=333349, q=163 ncpus=16)
Dec 07 23:58:06 Yoda kernel: Sending NMI from CPU 15 to CPUs 4:
Dec 07 23:58:06 Yoda kernel: NMI backtrace for cpu 4
Dec 07 23:58:06 Yoda kernel: CPU: 4 PID: 5091 Comm: t7xx_fsm Not tainted 6.6.3-1-default #1 openSUSE Tumbleweed 9031f8c42431aba86b6ec9224e28096b5a78e86b
Dec 07 23:58:06 Yoda kernel: Hardware name: LENOVO 21CES7N400/21CES7N400, BIOS N3AET73W (1.38 ) 05/11/2023
Dec 07 23:58:06 Yoda kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x271/0x2b0
Dec 07 23:58:06 Yoda kernel: Code: c1 ea 12 83 e0 03 83 ea 01 48 c1 e0 05 48 63 d2 48 05 c0 b0 03 00 48 03 04 d5 a0 7c 6a b5 48 89 28 8b 45 08 85 c0 75 09 f3 90 <8b> 45 08 85 c0 74 f7 48 8>
Dec 07 23:58:06 Yoda kernel: RSP: 0018:ffffc90004ef3e20 EFLAGS: 00000046
Dec 07 23:58:06 Yoda kernel: RAX: 0000000000000000 RBX: ffffc9000e653d70 RCX: 0000000000000000
Dec 07 23:58:06 Yoda kernel: RDX: 0000000000000066 RSI: 00000000019da140 RDI: ffffc9000e653d70
Dec 07 23:58:06 Yoda kernel: RBP: ffff88883f23b0c0 R08: ffff888100402a38 R09: ffffffffb5e6f3c0
Dec 07 23:58:06 Yoda kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
Dec 07 23:58:06 Yoda kernel: R13: 0000000000140000 R14: ffffc9000e653d70 R15: ffff88821bd39e40
Dec 07 23:58:06 Yoda kernel: FS:  0000000000000000(0000) GS:ffff88883f200000(0000) knlGS:0000000000000000
Dec 07 23:58:06 Yoda kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 07 23:58:06 Yoda kernel: CR2: 000055b64dad1460 CR3: 0000000507236000 CR4: 0000000000f50ee0
Dec 07 23:58:06 Yoda kernel: PKRU: 55555554
Dec 07 23:58:06 Yoda kernel: Call Trace:
Dec 07 23:58:06 Yoda kernel:  <NMI>
Dec 07 23:58:06 Yoda kernel:  ? nmi_cpu_backtrace+0x99/0x110
Dec 07 23:58:06 Yoda kernel:  ? nmi_cpu_backtrace_handler+0x11/0x20
Dec 07 23:58:06 Yoda kernel:  ? nmi_handle+0x5e/0x150
Dec 07 23:58:06 Yoda kernel:  ? default_do_nmi+0x40/0x100
Dec 07 23:58:06 Yoda kernel:  ? exc_nmi+0x1bd/0x270
Dec 07 23:58:06 Yoda kernel:  ? end_repeat_nmi+0x16/0x67
Dec 07 23:58:06 Yoda kernel:  ? native_queued_spin_lock_slowpath+0x271/0x2b0
Dec 07 23:58:06 Yoda kernel:  ? native_queued_spin_lock_slowpath+0x271/0x2b0
Dec 07 23:58:06 Yoda kernel:  ? native_queued_spin_lock_slowpath+0x271/0x2b0
Dec 07 23:58:06 Yoda kernel:  </NMI>
Dec 07 23:58:06 Yoda kernel:  <TASK>
Dec 07 23:58:06 Yoda kernel:  _raw_spin_lock_irqsave+0x3d/0x50
Dec 07 23:58:06 Yoda kernel:  complete_all+0x24/0x90
Dec 07 23:58:06 Yoda kernel:  fsm_main_thread+0x404/0x790 [mtk_t7xx 187043e0e3556a04be6cfada7d7b6986a9d50969]
Dec 07 23:58:06 Yoda kernel:  ? __pfx_autoremove_wake_function+0x10/0x10
Dec 07 23:58:06 Yoda kernel:  ? __pfx_fsm_main_thread+0x10/0x10 [mtk_t7xx 187043e0e3556a04be6cfada7d7b6986a9d50969]
Dec 07 23:58:06 Yoda kernel:  kthread+0xe5/0x120
Dec 07 23:58:06 Yoda kernel:  ? __pfx_kthread+0x10/0x10
Dec 07 23:58:06 Yoda kernel:  ret_from_fork+0x31/0x50
Dec 07 23:58:06 Yoda kernel:  ? __pfx_kthread+0x10/0x10
Dec 07 23:58:06 Yoda kernel:  ret_from_fork_asm+0x1b/0x30
Dec 07 23:58:06 Yoda kernel:  </TASK>