Bug 215763
Summary: | nvme0: Admin Cmd(0x6), I/O Error (sct 0x0 / sc 0x2) | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | sander44 (ionut_n2001) |
Component: | NVMe | Assignee: | IO/NVME Virtual Default Assignee (io_nvme) |
Status: | NEW --- | ||
Severity: | high | CC: | bnafta, icegood1980, kbusch |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 5.17.0-next | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | dmesg 5.17.0-next |
Description
sander44
2022-03-27 17:05:01 UTC
Created attachment 300627
dmesg 5.17.0-next
It's harmless. The driver is querying to see if the controller supports a particular mode of Identification. The NVMe spec doesn't provide a great way for a driver to know ahead of time whether a controller supports an optional identification, so the driver just has to try it to find out. There is a TP in the workgroup that may improve that situation, but it may take some time to publish and for devices to implement it.

I have the same error, but with kernel 5.19.0 from Ubuntu 22.04 (KDE Neon User edition).

    $ sysctl fs.inotify
    fs.inotify.max_queued_events = 32768
    fs.inotify.max_user_instances = 8192
    fs.inotify.max_user_watches = 1048576

    $ free -h
                   total        used        free      shared  buff/cache   available
    Mem:            62Gi        12Gi        37Gi       443Mi        12Gi        49Gi
    Swap:             0B          0B          0B

    $ uname -a
    Linux fabio-pc 5.19.0-43-generic #44~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon May 22 13:39:36 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

    $ sudo dmesg | grep nvme1
    [ 1.461384] nvme nvme1: pci function 0000:04:00.0
    [ 1.485117] nvme nvme1: 15/0/0 default/read/poll queues
    [ 1.503202] nvme1n1: p1 p2 p4 p5
    [ 4.183196] EXT4-fs (nvme1n1p2): mounted filesystem with ordered data mode. Quota mode: none.
    [ 4.525988] EXT4-fs (nvme1n1p2): re-mounted. Quota mode: none.
    [ 4.814331] Adding 67583996k swap on /dev/nvme1n1p5. Priority:-2 extents:1 across:67583996k SSFS
    [ 592.844663] nvme nvme1: I/O 640 QID 7 timeout, aborting
    [ 592.844674] nvme nvme1: I/O 641 QID 7 timeout, aborting
    [ 592.844678] nvme nvme1: I/O 642 QID 7 timeout, aborting
    [ 592.844682] nvme nvme1: I/O 643 QID 7 timeout, aborting
    [ 592.844686] nvme nvme1: I/O 644 QID 7 timeout, aborting
    [ 622.861066] nvme nvme1: I/O 640 QID 7 timeout, reset controller
    [ 654.797304] nvme nvme1: I/O 24 QID 0 timeout, reset controller
    [ 683.493736] nvme1n1: I/O Cmd(0x2) @ LBA 1318392128, 256 blocks, I/O Error (sct 0x3 / sc 0x71)
    [ 683.493747] I/O error, dev nvme1n1, sector 1318392128 op 0x0:(READ) flags 0x80700 phys_seg 32 prio class 0
    [ 683.493767] nvme nvme1: Abort status: 0x371
    [ 683.493770] nvme nvme1: Abort status: 0x371
    [ 683.493773] nvme nvme1: Abort status: 0x371
    [ 683.493774] nvme nvme1: Abort status: 0x371
    [ 683.493776] nvme nvme1: Abort status: 0x371
    [ 683.535306] nvme nvme1: 15/0/0 default/read/poll queues
    [ 1002.955149] nvme nvme1: I/O 0 QID 6 timeout, aborting
    [ 1002.955162] nvme nvme1: I/O 1 QID 6 timeout, aborting
    [ 1002.955168] nvme nvme1: I/O 832 QID 6 timeout, aborting
    [ 1002.955172] nvme nvme1: I/O 833 QID 6 timeout, aborting
    [ 1002.955177] nvme nvme1: I/O 834 QID 6 timeout, aborting
    [ 1033.674738] nvme nvme1: I/O 0 QID 6 timeout, reset controller
    [ 1064.393291] nvme nvme1: I/O 16 QID 0 timeout, reset controller
    [ 1095.144296] nvme1n1: I/O Cmd(0x2) @ LBA 566100072, 256 blocks, I/O Error (sct 0x3 / sc 0x71)
    [ 1095.144307] I/O error, dev nvme1n1, sector 566100072 op 0x0:(READ) flags 0x80700 phys_seg 32 prio class 0
    [ 1095.144363] nvme nvme1: Abort status: 0x371
    [ 1095.144367] nvme nvme1: Abort status: 0x371
    [ 1095.144369] nvme nvme1: Abort status: 0x371
    [ 1095.144371] nvme nvme1: Abort status: 0x371
    [ 1095.144373] nvme nvme1: Abort status: 0x371
    [ 1095.202389] nvme nvme1: 15/0/0 default/read/poll queues
    [ 2375.079668] nvme nvme1: I/O 706 QID 2 timeout, aborting
    [ 2375.079683] nvme nvme1: I/O 707 QID 2 timeout, aborting
    [ 2375.079689] nvme nvme1: I/O 708 QID 2 timeout, aborting
    [ 2375.079695] nvme nvme1: I/O 706 QID 3 timeout, aborting
    [ 2375.079700] nvme nvme1: I/O 708 QID 3 timeout, aborting
    [ 2405.798811] nvme nvme1: I/O 706 QID 2 timeout, reset controller
    [ 2436.517961] nvme nvme1: I/O 20 QID 0 timeout, reset controller
    [ 2467.257607] nvme nvme1: Abort status: 0x371
    [ 2467.257612] nvme nvme1: Abort status: 0x371
    [ 2467.257615] nvme nvme1: Abort status: 0x371
    [ 2467.257616] nvme nvme1: Abort status: 0x371
    [ 2467.257618] nvme nvme1: Abort status: 0x371
    [ 2467.297335] nvme nvme1: 15/0/0 default/read/poll queues
    [ 2500.004300] nvme nvme1: I/O 709 QID 3 timeout, aborting
    [ 2500.004315] nvme nvme1: I/O 711 QID 3 timeout, aborting
    [ 2500.004321] nvme nvme1: I/O 64 QID 5 timeout, aborting
    [ 2500.004328] nvme nvme1: I/O 576 QID 9 timeout, aborting
    [ 2500.004333] nvme nvme1: I/O 577 QID 9 timeout, aborting
    [ 2530.723505] nvme nvme1: I/O 709 QID 3 timeout, reset controller
    [ 2561.442725] nvme nvme1: I/O 20 QID 0 timeout, reset controller
    [ 2592.186195] nvme1n1: I/O Cmd(0x2) @ LBA 1423860112, 256 blocks, I/O Error (sct 0x3 / sc 0x71)
    [ 2592.186202] I/O error, dev nvme1n1, sector 1423860112 op 0x0:(READ) flags 0x80700 phys_seg 20 prio class 0
    [ 2592.186227] nvme nvme1: Abort status: 0x371
    [ 2592.186228] nvme nvme1: Abort status: 0x371
    [ 2592.186229] nvme nvme1: Abort status: 0x371
    [ 2592.186230] nvme nvme1: Abort status: 0x371
    [ 2592.186231] nvme nvme1: Abort status: 0x371
    [ 2592.226978] nvme nvme1: 15/0/0 default/read/poll queues
    [ 2721.182884] nvme nvme1: I/O 0 QID 4 timeout, aborting
    [ 2721.182899] nvme nvme1: I/O 1 QID 4 timeout, aborting
    [ 2721.182904] nvme nvme1: I/O 2 QID 4 timeout, aborting
    [ 2721.182909] nvme nvme1: I/O 3 QID 4 timeout, aborting
    [ 2721.182914] nvme nvme1: I/O 4 QID 4 timeout, aborting
    [ 2751.902167] nvme nvme1: I/O 0 QID 4 timeout, reset controller
    [ 2782.621470] nvme nvme1: I/O 15 QID 0 timeout, reset controller
    [ 2813.373247] nvme nvme1: Abort status: 0x371
    [ 2813.373251] nvme nvme1: Abort status: 0x371
    [ 2813.373253] nvme nvme1: Abort status: 0x371
    [ 2813.373255] nvme nvme1: Abort status: 0x371
    [ 2813.373256] nvme nvme1: Abort status: 0x371
    [ 2813.420857] nvme nvme1: 15/0/0 default/read/poll queues

    $ sudo smartctl -a /dev/nvme1
    smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.19.0-43-generic] (local build)
    Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Model Number:                       ADATA SX8200PNP
    Serial Number:                      2K342LQHNHJ2
    Firmware Version:                   42B2S7JA
    PCI Vendor/Subsystem ID:            0x1cc1
    IEEE OUI Identifier:                0x000000
    Controller ID:                      1
    NVMe Version:                       1.3
    Number of Namespaces:               1
    Namespace 1 Size/Capacity:          1.024.209.543.168 [1,02 TB]
    Namespace 1 Utilization:            892.418.854.912 [892 GB]
    Namespace 1 Formatted LBA Size:     512
    Local Time is:                      Sat Jun 3 13:06:09 2023 -03
    Firmware Updates (0x14):            2 Slots, no Reset required
    Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
    Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
    Log Page Attributes (0x0f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
    Maximum Data Transfer Size:         64 Pages
    Warning Comp. Temp. Threshold:      75 Celsius
    Critical Comp. Temp. Threshold:     80 Celsius

    Supported Power States
    St Op      Max  Active  Idle  RL RT WL WT  Ent_Lat  Ex_Lat
     0 +     9.00W       -     -   0  0  0  0        0       0
     1 +     4.60W       -     -   1  1  1  1        0       0
     2 +     3.80W       -     -   2  2  2  2        0       0
     3 -   0.0450W       -     -   3  3  3  3     2000    2000
     4 -   0.0040W       -     -   4  4  4  4    15000   15000

    Supported LBA Sizes (NSID 0x1)
    Id Fmt  Data  Metadt  Rel_Perf
     0 +     512       0         0

    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    SMART/Health Information (NVMe Log 0x02)
    Critical Warning:                   0x00
    Temperature:                        39 Celsius
    Available Spare:                    100%
    Available Spare Threshold:          10%
    Percentage Used:                    10%
    Data Units Read:                    28.790.609 [14,7 TB]
    Data Units Written:                 98.726.571 [50,5 TB]
    Host Read Commands:                 605.778.838
    Host Write Commands:                1.588.032.646
    Controller Busy Time:               21.672
    Power Cycles:                       1.188
    Power On Hours:                     9.520
    Unsafe Shutdowns:                   151
    Media and Data Integrity Errors:    0
    Error Information Log Entries:      0
    Warning Comp. Temperature Time:     0
    Critical Comp. Temperature Time:    0

    Error Information (NVMe Log 0x01, 16 of 256 entries)
    No Errors Logged

Additionally, my system freezes for 40 seconds or more and only comes back after I press CTRL + ALT + F2, wait a few seconds, and press CTRL + ALT + F1.

I am trying to identify whether the problem is caused by a misconfiguration in the kernel, by an SSD controller or SSD NAND malfunction (the ADATA SSD Toolbox on Windows 10 does not report any errors), by a power setting as described in https://wiki.archlinux.org/title/Solid_state_drive/NVMe#Controller_failure_due_to_broken_APST_support, by misbehavior of the official NVIDIA driver, by a problem in X, or by excessive inode use from the JetBrains PyCharm or Android Studio IDEs.

If anyone knows how I can get more useful logs to help figure out the cause of the problem, I would appreciate it.
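
For anyone reading logs like the ones above, here is a minimal Python sketch that decodes the status values and counts the timeout messages per queue. It assumes the combined value printed as "Abort status" carries the status code (SC) in bits 7:0 and the status code type (SCT) in bits 10:8, which matches how the log pairs `0x371` with `sct 0x3 / sc 0x71`. The names `split_status`, `describe` and `summarize_timeouts` are made up for this example, and the lookup table only covers the two status pairs seen in this report.

```python
import re
from collections import Counter

# Status code type (SCT) names as defined by the NVMe base specification.
SCT_NAMES = {
    0x0: "Generic Command Status",
    0x1: "Command Specific Status",
    0x2: "Media and Data Integrity Errors",
    0x3: "Path Related Status",
}

# Only the (sct, sc) pairs that appear in this report; not a complete table.
KNOWN_STATUS = {
    (0x0, 0x02): "Invalid Field in Command",
    (0x3, 0x71): "Command Aborted By Host",
}

# Matches lines such as "nvme nvme1: I/O 640 QID 7 timeout, aborting".
TIMEOUT_RE = re.compile(
    r"nvme (nvme\d+): I/O (\d+) QID (\d+) timeout, (aborting|reset controller)"
)


def split_status(status):
    """Split a combined status value (e.g. 0x371) into (sct, sc).

    Assumption: SC is in bits 7:0 and SCT in bits 10:8.
    """
    return (status >> 8) & 0x7, status & 0xFF


def describe(sct, sc):
    """Return a readable name for an (sct, sc) pair, if it is known here."""
    sct_name = SCT_NAMES.get(sct, "unknown SCT 0x%x" % sct)
    sc_name = KNOWN_STATUS.get((sct, sc), "unknown SC 0x%02x" % sc)
    return "%s / %s" % (sct_name, sc_name)


def summarize_timeouts(lines):
    """Count timeout messages per (queue id, action). QID 0 is the admin queue."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(int(match.group(3)), match.group(4))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Status from the bug title and from the "Abort status" lines.
    print(describe(0x0, 0x02))
    print(describe(*split_status(0x371)))
    # Usage: sudo dmesg | python3 this_script.py
    for (qid, action), n in sorted(summarize_timeouts(sys.stdin).items()):
        print("QID %d: %d x %s" % (qid, n, action))
```

Under these assumptions, `sct 0x0 / sc 0x2` from the bug title is an "Invalid Field in Command" response, consistent with the controller rejecting an optional Identify variant, while `0x371` in the second report is a host-side abort that follows a timeout and is not something the drive reported about its media.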