Bug 217863
| Summary: | Lexar NM790 SSDs are not recognized anymore after 6.1.50 LTS | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | Claudio Sampaio (patola) |
| Component: | NVMe | Assignee: | IO/NVME Virtual Default Assignee (io_nvme) |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | high | CC: | andreas.cm, dominikowski, felixonmars, kbusch, mail, mario.limonciello, patola, rauchwolke, ruedi.reifen, se77en.cc, tripmag7 |
| Priority: | P3 | | |
| Hardware: | AMD | | |
| OS: | Linux | | |
| Kernel Version: | | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Attachments: | lexar patch, proposed new patch | | |
Description
Claudio Sampaio
2023-09-02 14:01:18 UTC
Sorry, forgot to tell, PCI ID of my device: 1d97:1602

Adding the two lines

    { PCI_DEVICE(0x1d97, 0x1602), /* Lexar NM790 */
        .driver_data = NVME_QUIRK_BOGUS_NID, },

in file drivers/nvme/host/pci.c made my NVMe work correctly. Compiled a new 6.5.1 kernel and everything works.

As you already identified a solution; can you format this as a proper patch and send it to the linux-nvme mailing list?

Sorry; I see there is a related conversation for this.
https://lore.kernel.org/linux-nvme/CA+4wXKAe3==K1CoSvmRpbmPcR9QK+EuzDB-cNteNB34Cgur_0w@mail.gmail.com/T/#ma4e199c887a1b0d1feacb5331a3a6ad9dabeb91f
Can you please reproduce this with 6.5.1 as requested?

Created attachment 305061 [details]
lexar patch

i can reproduce the bug too. using this patch fixes the problem for me. the patch is from:
https://lore.kernel.org/lkml/7cd693dd-a6d7-4aab-aef0-76a8366ceee6@archlinux.org/

the dmesg output:
[ 6.547090] nvme nvme0: pci function 0000:3d:00.0
[ 6.559676] nvme nvme0: [PATCH] nvme core got timeout 0
[ 6.559683] nvme nvme0: [PATCH] nvme_wait_ready now wait for 2, previously 0
[ 6.563149] nvme nvme0: allocated 40 MiB host memory buffer.

i can reproduce the bug too - with 6.5.2

(In reply to Thomas Mann from comment #5)
> i can reproduce the bug too. using this patch fixes the problem for me. the
> patch is from:
> https://lore.kernel.org/lkml/7cd693dd-a6d7-4aab-aef0-76a8366ceee6@archlinux.org/

This patch makes sense from the observation. The device is not providing an appropriate time to ready, so making it larger sounds like the right direction. It'll probably need to be a new quirk, though.

Created attachment 305072 [details]
proposed new patch
Hi,
I think I have found a better solution. Please try my new attached patch if you are interested :)
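
The earlier observation that the device does not provide an appropriate time to ready can be illustrated with a small standalone C sketch. It is not the attached patch and not any upstream commit; it only models the idea of falling back to the CAP.TO timeout when the CRTO-derived value is zero. All `demo_` names are hypothetical, and the caller is assumed to have already extracted the relevant 16-bit CRTO timeout field. Both values are in units of 500 ms, as the NVMe specification defines them.

```c
#include <assert.h>
#include <stdint.h>

/* CAP.TO lives in bits 31:24 of the 64-bit CAP register (500 ms units). */
static uint32_t demo_cap_timeout_units(uint64_t cap)
{
	return (uint32_t)((cap >> 24) & 0xff);
}

/*
 * Hypothetical helper: choose the controller-ready timeout in 500 ms units.
 * crto_units is the already-extracted CRTO timeout field (16 bits wide).
 * A value of 0 is treated as bogus and replaced by CAP.TO.
 */
static uint32_t demo_ready_timeout_units(uint64_t cap, uint32_t crto_units)
{
	assert(crto_units <= 0xffff);
	if (crto_units == 0)
		return demo_cap_timeout_units(cap);
	return crto_units;
}

/* Convert 500 ms units to milliseconds. */
static uint32_t demo_units_to_ms(uint32_t units)
{
	return units * 500u;
}
```

With CAP.TO at 255 and a CRTO field of 0, `demo_ready_timeout_units` returns 255, which `demo_units_to_ms` converts to 127500 ms.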
(In reply to Felix Yan from comment #8)
> Created attachment 305072 [details]
> proposed new patch
>
> Hi,
>
> I think I have found a better solution. Please try my new attached patch if
> you are interested :)

Hi,
the patch fixes the bug, at least i couldn't reproduce the error by now.

[ 6.547047] nvme nvme0: pci function 0000:3d:00.0
[ 6.556641] nvme nvme0: Ignoring bogus CRTO (0), falling back to NVME_CAP_TIMEOUT (255)
[ 6.562276] nvme nvme0: allocated 40 MiB host memory buffer.
[ 6.572543] nvme nvme0: 8/0/0 default/read/poll queues
[ 6.616338] nvme0n1: p1 p2

I can confirm this new patch fixes the problem for me on Archlinux with kernel linux-next.
Thanks a lot!
Best
Andreas

Hi!
I had two problems with my Lexar NM790:
1. NVME Device not ready at booting
2. Freeze at waking up from sleep mode that occurred in about 75 % of wake ups. Hard reset was needed.

This new patch:
https://bugzilla.kernel.org/attachment.cgi?id=305072&action=diff
solves all those problems completely.
I have Asus ExpertBook B9400 and Kubuntu 23.04 on 6.2.0-31 Kernel.
Thanks guys!

Upstream and stable have applied an appropriate patch to address this bz, so should be fixed at the next release tag.

Hi, another confirm here.
I tested with (2x) 4TB Lexar NM790 (0x1d97, 0x1602) with build based on kernel-6.2.16.
Issues were:
1. (almost every boot) nvme nvme1: Device not ready; aborting initialisation, CSTS=0x0
2. (occasionally) nvme nvme0: missing or invalid SUBNQN field.
After applying the patch from Felix
https://bugzilla.kernel.org/attachment.cgi?id=305072&action=diff
no issues seen anymore.
I'm wondering if the invalid SUBNQN field is addressed by the patch too.
Thanks

I have the same problem with a 4TB Lexar NM790.
I only use Linux occasionally and I'm basically lost in it... :D
Could someone explain to me how to proceed and apply the patch for a new installation of Ubuntu 22.04?
Thanks

@Tripmag then you are probably on kernel version 5.15.x.
This patch fixes an initialization problem of the controller on the SSD that was made visible after kernel version 6.1.x. You would need a kernel new enough (> ?) to support the drive and old enough (<= 6.1.x). With kernels newer than 6.1.x you would need to build a custom kernel with this patch until your Linux distribution catches up.
In your case it is probably best to take this to your distribution and ask people there what is best for you. Someone might just have a fitting repo with a kernel or suggest an upgrade to a newer version of the distro.

6.5.5 fixes the bug, aka i couldn't reproduce the bug by now

thank you, I will try to go this way again
I also thought about a fresh installation on a 1TB disk, then an update + patch and cloning to a 4TB Lexar
but I've never done a kernel patch and it seems out of my league...
maybe I'll try to install 23.10 on 1TB, update kernel to 6.5.5 and clone to 4TB

OK
install 23.10 on 1TB, update kernel to 6.5.5 and clone to 4TB
this is my working variant for now and an opportunity to test this newer version

For me it is safer to use a kernel tested with my distro version. I am using Kubuntu so I will probably need to patch and build the kernel until my distro receives at least version 6.5.5.
If someone needs to go the same way, here are tutorials on how to patch and build the kernel:
https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel
https://wiki.ubuntu.com/Kernel/Dev/KernelGitGuide

Patch using git:
https://www.specbee.com/blogs/how-create-and-apply-patch-git-diff-and-git-apply-commands-your-drupal-website

I am using secure boot so signing the built kernel is also crucial:
https://gloveboxes.github.io/Ubuntu-for-Azure-Developers/docs/signing-kernel-for-secure-boot.html

I had some problems with building Kernel 6.2.0 with rust so I have built it without rust using this command:

$ cd debian/build/build-generic
$ make bindeb-pkg ARCH=x86 CROSS_COMPILE=x86_64-linux-gnu- HOSTCC=x86_64-linux-gnu-gcc-12 CC=x86_64-linux-gnu-gcc-12 KERNELVERSION=6.2.0-31-generic CONFIG_DEBUG_SECTION_MISMATCH=y KBUILD_BUILD_VERSION="31" LOCALVERSION= localver-extra= CFLAGS_MODULE="-DPKG_ABI=31" PYTHON=python3 O=/home/haz/lunar/debian/build/build-generic -j8 olddefconfig

Good Luck!

Sorry for commenting on a closed bug, I have a 2TB Lexar NM790 which only works with kernel 6.1, with a livecd with kernel 6.5.7 it's effectively invisible during boot, with earlier versions (6.2-6.5.x) I had the same issue as the bug report, now I don't even have that. The only thing I see is:

nvme 0000:02:00.0: platform quirk: setting simple suspend

lsblk still shows no nvme devices. How do I report this?

(In reply to Tomasz from comment #20)
> Sorry for commenting on a closed bug, I have a 2TB Lexar NM790 which only
> works with kernel 6.1, with a livecd with kernel 6.5.7 it's effectively
> invisible during boot, with earlier versions (6.2-6.5.x) I had the same
> issue as the bug report, now I don't even have that. The only thing I see is:
>
> nvme 0000:02:00.0: platform quirk: setting simple suspend
>
> lsblk still shows no nvme devices. How do I report this?

Try using kernel 6.6.0-rc5. It has the fix, I'm using it right now and it also handles my AMD CPU and GPU better.

It seems all SSDs using the MaxIO MAP1602A will be affected. I use an Acer Predator SSD GM7 M.2 4TB and have the same issue. The Acer GM7 and Lexar NM790 both use the MaxIO MAP1602A.
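
The first workaround in this thread keyed a quirk on the PCI ID 1d97:1602. As a rough illustration of that kind of ID-to-quirk lookup, here is a minimal standalone C sketch. The struct, the table, and the flag value are hypothetical stand-ins, not the kernel's definitions, and whether other MAP1602A-based drives report the same PCI ID is not established in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_QUIRK_EXAMPLE (1u << 0) /* hypothetical flag, not a kernel value */

/* Hypothetical table entry: PCI vendor/device pair plus quirk flags. */
struct demo_quirk_entry {
	uint16_t vendor;
	uint16_t device;
	uint32_t quirks;
};

static const struct demo_quirk_entry demo_quirk_table[] = {
	{ 0x1d97, 0x1602, DEMO_QUIRK_EXAMPLE }, /* Lexar NM790, ID as reported in this bug */
};

/* Return the quirk flags for a vendor/device pair, or 0 if it is not listed. */
static uint32_t demo_lookup_quirks(const struct demo_quirk_entry *table,
				   size_t count, uint16_t vendor,
				   uint16_t device)
{
	size_t i;

	assert(table != NULL || count == 0);
	for (i = 0; i < count; i++) {
		if (table[i].vendor == vendor && table[i].device == device)
			return table[i].quirks;
	}
	return 0;
}
```

Calling `demo_lookup_quirks(demo_quirk_table, 1, 0x1d97, 0x1602)` returns `DEMO_QUIRK_EXAMPLE`; any ID that is not in the table yields 0, meaning no quirk is applied.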