Hi,

Today I tried the vanilla 5.17.0-next kernel.

commit f022814633e1c600507b3a99691b4d624c2813f0 (grafted, HEAD -> master, origin/master, origin/HEAD)
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sat Mar 26 14:54:41 2022 -0700

    Merge tag 'trace-v5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

    Pull trace event string verifier fix from Steven Rostedt:
     "The run-time string verifier checks all trace event formats as they
      are read from the tracing file to make sure that the %s pointers are
      not reading something that no longer exists.

      However, it failed to account for the valid case of '%*.s' where the
      length given is zero, and the string is NULL. It incorrectly flagged
      it as a null pointer dereference and gave a WARN_ON()"

    * tag 'trace-v5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
      tracing: Have trace event string test handle zero length strings

nvme0: Admin Cmd(0x6), I/O Error (sct 0x0 / sc 0x2)

dmesg | grep -E "nvme|nvme0"
[ 1.457145] nvme 0000:03:00.0: platform quirk: setting simple suspend
[ 1.457179] nvme nvme0: pci function 0000:03:00.0
[ 1.534265] nvme0: Admin Cmd(0x6), I/O Error (sct 0x0 / sc 0x2)
[ 1.541260] nvme nvme0: 8/0/0 default/read/poll queues
[ 1.542478] nvme0n1: p1 p2 p3 p4
[ 2.991708] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Quota mode: none.
[ 4.910626] Adding 8388604k swap on /dev/nvme0n1p3. Priority:-2 extents:1 across:8388604k SSFS
[ 4.915648] EXT4-fs (nvme0n1p2): re-mounted. Quota mode: none.
[ 5.116574] EXT4-fs (nvme0n1p4): mounted filesystem with ordered data mode. Quota mode: none.

lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge
00:02.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 51)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 7
01:00.0 VGA compatible controller: NVIDIA Corporation GA106M [GeForce RTX 3060 Mobile / Max-Q] (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)
02:00.0 Network controller: MEDIATEK Corp. MT7921 802.11ax PCI Express Wireless Network Adapter
03:00.0 Non-Volatile memory controller: Intel Corporation Device f1aa (rev 03)
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne (rev c4)
04:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Renoir Radeon High Definition Audio Controller
04:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
04:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
04:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1
04:00.5 Multimedia controller: Advanced Micro Devices, Inc. [AMD] Raven/Raven2/FireFlight/Renoir Audio Processor (rev 01)
04:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) HD Audio Controller

INTEL SSDPEKNU010TZ
Created attachment 300627: dmesg 5.17.0-next
It's harmless. The driver is querying whether the controller supports a particular mode of identification. The NVMe spec doesn't give a driver a good way to know ahead of time whether a controller supports an optional identification, so the driver just has to try it and see. There is a TP (technical proposal) in the workgroup that may improve the situation, but it may take some time to be published and for devices to implement it.
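For anyone who wants to read these log lines themselves, here is a minimal Python sketch that splits out the opcode and the sct/sc pair and looks them up. The function name and lookup tables are hypothetical helpers written for this thread, not kernel or nvme-cli code, and the tables only cover a handful of values that I believe match the NVMe status code tables (admin opcode 0x06 is Identify, sct 0x0 / sc 0x02 is Invalid Field in Command). Please check anything else against the spec.

```python
import re

# Deliberately tiny lookup tables covering only values seen in this thread.
# These names are hypothetical helpers, not part of the kernel or nvme-cli.
ADMIN_OPCODES = {0x06: "Identify"}
IO_OPCODES = {0x01: "Write", 0x02: "Read"}
STATUS_CODE_TYPES = {0x0: "Generic Command Status", 0x3: "Path Related Status"}
STATUS_CODES = {
    (0x0, 0x02): "Invalid Field in Command",
    (0x3, 0x71): "Command Aborted By Host",
}

# Matches e.g. "Admin Cmd(0x6), I/O Error (sct 0x0 / sc 0x2)".
ERROR_RE = re.compile(
    r"(?P<kind>Admin|I/O) Cmd\((?P<opcode>0x[0-9a-fA-F]+)\)"
    r".*?\(sct (?P<sct>0x[0-9a-fA-F]+) / sc (?P<sc>0x[0-9a-fA-F]+)\)"
)


def decode_nvme_error(line):
    """Return a dict describing the command and status in one log line, or None."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    opcode = int(match.group("opcode"), 16)
    sct = int(match.group("sct"), 16)
    sc = int(match.group("sc"), 16)
    # Admin and I/O commands use separate opcode spaces.
    opcodes = ADMIN_OPCODES if match.group("kind") == "Admin" else IO_OPCODES
    return {
        "queue": match.group("kind"),
        "command": opcodes.get(opcode, "unknown"),
        "status_type": STATUS_CODE_TYPES.get(sct, "unknown"),
        "status": STATUS_CODES.get((sct, sc), "unknown"),
    }


print(decode_nvme_error("nvme0: Admin Cmd(0x6), I/O Error (sct 0x0 / sc 0x2)"))
```

Read that way, the line is an Identify command that the controller rejected as an invalid field, which is consistent with a controller that simply does not implement the optional identification being probed.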
I have the same error but with Kernel 5.19.0 from Ubuntu 22.04 (KDE Neon User edition)

$ sysctl fs.inotify
fs.inotify.max_queued_events = 32768
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 1048576

$ free -h
               total        used        free      shared  buff/cache   available
Mem:            62Gi        12Gi        37Gi       443Mi        12Gi        49Gi
Swap:             0B          0B          0B

$ uname -a
Linux fabio-pc 5.19.0-43-generic #44~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon May 22 13:39:36 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

$ sudo dmesg | grep nvme1
[ 1.461384] nvme nvme1: pci function 0000:04:00.0
[ 1.485117] nvme nvme1: 15/0/0 default/read/poll queues
[ 1.503202] nvme1n1: p1 p2 p4 p5
[ 4.183196] EXT4-fs (nvme1n1p2): mounted filesystem with ordered data mode. Quota mode: none.
[ 4.525988] EXT4-fs (nvme1n1p2): re-mounted. Quota mode: none.
[ 4.814331] Adding 67583996k swap on /dev/nvme1n1p5. Priority:-2 extents:1 across:67583996k SSFS
[ 592.844663] nvme nvme1: I/O 640 QID 7 timeout, aborting
[ 592.844674] nvme nvme1: I/O 641 QID 7 timeout, aborting
[ 592.844678] nvme nvme1: I/O 642 QID 7 timeout, aborting
[ 592.844682] nvme nvme1: I/O 643 QID 7 timeout, aborting
[ 592.844686] nvme nvme1: I/O 644 QID 7 timeout, aborting
[ 622.861066] nvme nvme1: I/O 640 QID 7 timeout, reset controller
[ 654.797304] nvme nvme1: I/O 24 QID 0 timeout, reset controller
[ 683.493736] nvme1n1: I/O Cmd(0x2) @ LBA 1318392128, 256 blocks, I/O Error (sct 0x3 / sc 0x71)
[ 683.493747] I/O error, dev nvme1n1, sector 1318392128 op 0x0:(READ) flags 0x80700 phys_seg 32 prio class 0
[ 683.493767] nvme nvme1: Abort status: 0x371
[ 683.493770] nvme nvme1: Abort status: 0x371
[ 683.493773] nvme nvme1: Abort status: 0x371
[ 683.493774] nvme nvme1: Abort status: 0x371
[ 683.493776] nvme nvme1: Abort status: 0x371
[ 683.535306] nvme nvme1: 15/0/0 default/read/poll queues
[ 1002.955149] nvme nvme1: I/O 0 QID 6 timeout, aborting
[ 1002.955162] nvme nvme1: I/O 1 QID 6 timeout, aborting
[ 1002.955168] nvme nvme1: I/O 832 QID 6 timeout, aborting
[ 1002.955172] nvme nvme1: I/O 833 QID 6 timeout, aborting
[ 1002.955177] nvme nvme1: I/O 834 QID 6 timeout, aborting
[ 1033.674738] nvme nvme1: I/O 0 QID 6 timeout, reset controller
[ 1064.393291] nvme nvme1: I/O 16 QID 0 timeout, reset controller
[ 1095.144296] nvme1n1: I/O Cmd(0x2) @ LBA 566100072, 256 blocks, I/O Error (sct 0x3 / sc 0x71)
[ 1095.144307] I/O error, dev nvme1n1, sector 566100072 op 0x0:(READ) flags 0x80700 phys_seg 32 prio class 0
[ 1095.144363] nvme nvme1: Abort status: 0x371
[ 1095.144367] nvme nvme1: Abort status: 0x371
[ 1095.144369] nvme nvme1: Abort status: 0x371
[ 1095.144371] nvme nvme1: Abort status: 0x371
[ 1095.144373] nvme nvme1: Abort status: 0x371
[ 1095.202389] nvme nvme1: 15/0/0 default/read/poll queues
[ 2375.079668] nvme nvme1: I/O 706 QID 2 timeout, aborting
[ 2375.079683] nvme nvme1: I/O 707 QID 2 timeout, aborting
[ 2375.079689] nvme nvme1: I/O 708 QID 2 timeout, aborting
[ 2375.079695] nvme nvme1: I/O 706 QID 3 timeout, aborting
[ 2375.079700] nvme nvme1: I/O 708 QID 3 timeout, aborting
[ 2405.798811] nvme nvme1: I/O 706 QID 2 timeout, reset controller
[ 2436.517961] nvme nvme1: I/O 20 QID 0 timeout, reset controller
[ 2467.257607] nvme nvme1: Abort status: 0x371
[ 2467.257612] nvme nvme1: Abort status: 0x371
[ 2467.257615] nvme nvme1: Abort status: 0x371
[ 2467.257616] nvme nvme1: Abort status: 0x371
[ 2467.257618] nvme nvme1: Abort status: 0x371
[ 2467.297335] nvme nvme1: 15/0/0 default/read/poll queues
[ 2500.004300] nvme nvme1: I/O 709 QID 3 timeout, aborting
[ 2500.004315] nvme nvme1: I/O 711 QID 3 timeout, aborting
[ 2500.004321] nvme nvme1: I/O 64 QID 5 timeout, aborting
[ 2500.004328] nvme nvme1: I/O 576 QID 9 timeout, aborting
[ 2500.004333] nvme nvme1: I/O 577 QID 9 timeout, aborting
[ 2530.723505] nvme nvme1: I/O 709 QID 3 timeout, reset controller
[ 2561.442725] nvme nvme1: I/O 20 QID 0 timeout, reset controller
[ 2592.186195] nvme1n1: I/O Cmd(0x2) @ LBA 1423860112, 256 blocks, I/O Error (sct 0x3 / sc 0x71)
[ 2592.186202] I/O error, dev nvme1n1, sector 1423860112 op 0x0:(READ) flags 0x80700 phys_seg 20 prio class 0
[ 2592.186227] nvme nvme1: Abort status: 0x371
[ 2592.186228] nvme nvme1: Abort status: 0x371
[ 2592.186229] nvme nvme1: Abort status: 0x371
[ 2592.186230] nvme nvme1: Abort status: 0x371
[ 2592.186231] nvme nvme1: Abort status: 0x371
[ 2592.226978] nvme nvme1: 15/0/0 default/read/poll queues
[ 2721.182884] nvme nvme1: I/O 0 QID 4 timeout, aborting
[ 2721.182899] nvme nvme1: I/O 1 QID 4 timeout, aborting
[ 2721.182904] nvme nvme1: I/O 2 QID 4 timeout, aborting
[ 2721.182909] nvme nvme1: I/O 3 QID 4 timeout, aborting
[ 2721.182914] nvme nvme1: I/O 4 QID 4 timeout, aborting
[ 2751.902167] nvme nvme1: I/O 0 QID 4 timeout, reset controller
[ 2782.621470] nvme nvme1: I/O 15 QID 0 timeout, reset controller
[ 2813.373247] nvme nvme1: Abort status: 0x371
[ 2813.373251] nvme nvme1: Abort status: 0x371
[ 2813.373253] nvme nvme1: Abort status: 0x371
[ 2813.373255] nvme nvme1: Abort status: 0x371
[ 2813.373256] nvme nvme1: Abort status: 0x371
[ 2813.420857] nvme nvme1: 15/0/0 default/read/poll queues

$ sudo smartctl -a /dev/nvme1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.19.0-43-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       ADATA SX8200PNP
Serial Number:                      2K342LQHNHJ2
Firmware Version:                   42B2S7JA
PCI Vendor/Subsystem ID:            0x1cc1
IEEE OUI Identifier:                0x000000
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1.024.209.543.168 [1,02 TB]
Namespace 1 Utilization:            892.418.854.912 [892 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Sat Jun 3 13:06:09 2023 -03
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         64 Pages
Warning Comp. Temp. Threshold:      75 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W       -        -    0  0  0  0        0       0
 1 +     4.60W       -        -    1  1  1  1        0       0
 2 +     3.80W       -        -    2  2  2  2        0       0
 3 -   0.0450W       -        -    3  3  3  3     2000    2000
 4 -   0.0040W       -        -    4  4  4  4    15000   15000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        39 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    10%
Data Units Read:                    28.790.609 [14,7 TB]
Data Units Written:                 98.726.571 [50,5 TB]
Host Read Commands:                 605.778.838
Host Write Commands:                1.588.032.646
Controller Busy Time:               21.672
Power Cycles:                       1.188
Power On Hours:                     9.520
Unsafe Shutdowns:                   151
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning Comp. Temperature Time:     0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged
Additionally, my system freezes for 40 seconds or more and only recovers after I press CTRL + ALT + F2, wait a few seconds, and press CTRL + ALT + F1.

I am trying to identify whether the problem is caused by a misconfiguration in the kernel; by a malfunction of the SSD controller or the SSD NAND (the ADATA SSD Toolbox on Windows 10 does not report any errors); by a power setting, as described in https://wiki.archlinux.org/title/Solid_state_drive/NVMe#Controller_failure_due_to_broken_APST_support; by misbehavior of the official NVIDIA driver; by a problem in X; or by excessive use of inodes by the JetBrains PyCharm or Android Studio IDEs.

If anyone knows how I can get more useful logs to help figure out the cause of the problem, I would appreciate it.
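One way to make a long log like this easier to compare across boots is to condense the timeout lines into counts per queue plus a list of controller resets. The following is only a rough Python sketch under that assumption: `summarize_timeouts` is a hypothetical helper, and the regular expression is written against the exact message format pasted in this comment, which other kernel versions may not share.

```python
import re
from collections import Counter

# Matches lines such as:
#   [ 592.844663] nvme nvme1: I/O 640 QID 7 timeout, aborting
#   [ 622.861066] nvme nvme1: I/O 640 QID 7 timeout, reset controller
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+nvme (?P<ctrl>nvme\d+): "
    r"I/O (?P<tag>\d+) QID (?P<qid>\d+) timeout, "
    r"(?P<action>aborting|reset controller)"
)


def summarize_timeouts(dmesg_text):
    """Return (aborts per queue ID, list of (timestamp, queue ID) resets)."""
    aborts = Counter()
    resets = []
    for match in TIMEOUT_RE.finditer(dmesg_text):
        qid = int(match.group("qid"))
        if match.group("action") == "aborting":
            aborts[qid] += 1
        else:
            resets.append((float(match.group("ts")), qid))
    return aborts, resets


sample = """\
[ 592.844663] nvme nvme1: I/O 640 QID 7 timeout, aborting
[ 592.844674] nvme nvme1: I/O 641 QID 7 timeout, aborting
[ 622.861066] nvme nvme1: I/O 640 QID 7 timeout, reset controller
[ 654.797304] nvme nvme1: I/O 24 QID 0 timeout, reset controller
"""
aborts, resets = summarize_timeouts(sample)
print(dict(aborts), resets)
```

The same function could be fed the full output of `sudo dmesg` saved to a file. QID 0 is the admin queue, so resets attributed to it mean that admin commands are timing out as well, not only reads and writes.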