Created attachment 307616 [details] dmesg log with errors

Hello. I recently noticed that on my motherboard (MSI A520M Pro) I eventually get partition table corruption. When I connect an external SSD drive, my dmesg log is flooded with lines like:

[ 114.674453] usb 6-2: reset SuperSpeed Plus Gen 2x1 USB device number 2 using xhci_hcd

I see it on both the 6.12.13 and 6.13.1 kernels on Arch Linux. I am attaching both the full dmesg log and the dmidecode log.
Created attachment 307617 [details] dmidecode log
I also ran both:

sudo fdisk -l /dev/sdb
sudo smartctl -x /dev/sdb

First command:

sudo fdisk -l /dev/sdb
Disk /dev/sdb: 465,76 GiB, 500107862016 bytes, 976773168 sectors
Disk model: MobileDataStar
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 0D820E40-5858-9090-8081-828310111213

Device     Start       End   Sectors   Size Type
/dev/sdb1   2048 976773119 976771072 465,8G Microsoft basic data

Second command: see the attached file.
Created attachment 307639 [details] smartctl infos
Someone on this Arch Linux forum thread - https://bbs.archlinux.org/viewtopic.php?pid=2226102 - told me to try disabling UAS for the external SSD / HDD. No luck: I still get the USB reset error message.
Hi,

1. According to your dmesg snippet UAS was already disabled back then.

2. AFAIK usb-storage uses device reset to recover from various errors, maybe this is simply a matter of poor USB link quality. Try:

echo 'func handle_tx_event +p' >/proc/dynamic_debug/control

3. Corruption sounds bad. Is it reproducible, i.e. you write more data and more problems show up? Are things still broken when the disk is read by other machines?

4. Does the same disk work any better on other machines?

5. FYI, some buggy USB SATA bridges report smaller than actual capacity, which can cause problems with reading GPT tables at the end of the disk.
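Point 5 can be sanity-checked with a little arithmetic on the fdisk figures quoted earlier in this report. The following is a minimal sketch, assuming the default GPT layout (128 partition entries of 128 bytes each) and 512-byte sectors; the function name and the returned keys are illustrative only and are not part of any tool.

```python
def gpt_geometry(total_sectors, sector_size=512, entries=128, entry_size=128):
    """Return where the GPT landmarks should sit on a disk of this size.

    Assumption: default GPT layout (128 entries of 128 bytes). A disk
    partitioned with other values needs different arguments.
    """
    # Sectors occupied by one copy of the partition entry array (rounded up).
    table_sectors = (entries * entry_size + sector_size - 1) // sector_size
    return {
        "size_bytes": total_sectors * sector_size,
        # The backup GPT header lives in the very last sector of the disk.
        "backup_header_lba": total_sectors - 1,
        # The backup entry array sits just before it, so usable space ends here.
        "last_usable_lba": total_sectors - 1 - table_sectors - 1,
    }


def partition_fits(end_lba, total_sectors, **layout):
    """True if a partition ending at end_lba stays inside the usable area."""
    return end_lba <= gpt_geometry(total_sectors, **layout)["last_usable_lba"]


# Figures taken from the fdisk output in this report.
geo = gpt_geometry(976773168)
print(geo)
print(partition_fits(976773119, 976773168))
```

If a bridge reported fewer sectors than the disk was partitioned with, the backup header would be expected at an LBA past the reported end of the disk, which is the kind of mismatch point 5 describes.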
Hello.

1. OK. I did not notice it.

2. When I try this command line, I get:

zsh: permission denied: /proc/dynamic_debug/control

3. I tried on other computers and no corruption occurs.

4. Yes.

I tried less /proc/dynamic_debug/control and got an enormous output. I am attaching it in case it helps to understand what is going on.
Created attachment 307663 [details] /proc/dynamic_debug/control content
> 2. When I try this command line, I got zsh: permission denied:
> /proc/dynamic_debug/control

You need to be root for this and sudo won't help you without extra steps:
https://stackoverflow.com/questions/82256/how-do-i-use-sudo-to-redirect-output-to-a-location-i-dont-have-permission-to-wr

Once this works please run dmesg again and see if something new shows up between those "reset USB device" messages.
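The redirection fails because the shell, not the command run under sudo, opens the file. The usual shell workaround is to let a root process do the write, for example `echo 'func handle_tx_event +p' | sudo tee /proc/dynamic_debug/control`. The same step can be sketched in Python; the helper name below is hypothetical, and the script has to be run as root for the real control file to be writable.

```python
CONTROL_PATH = "/proc/dynamic_debug/control"


def enable_dynamic_debug(query, control_path=CONTROL_PATH):
    """Write a single dynamic-debug query to the control file.

    The process calling this must already have permission to write
    control_path (in practice: run the whole script as root).
    """
    with open(control_path, "w") as control:
        control.write(query + "\n")


# Example call (run as root), using the query suggested in this thread:
# enable_dynamic_debug("func handle_tx_event +p")
```

The path is a parameter only so that the helper can be tried against an ordinary file first.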
Modification done. I tried with my external USB HDD, copying big files. I no longer have access to the previous USB device. Here is the output while copying 2 big tar.xz archives (6 GB each):

[11012.004194] sd 6:0:0:0: [sdb] 976773164 512-byte logical blocks: (500 GB/466 GiB)
[11012.004519] sd 6:0:0:0: [sdb] Write Protect is off
[11012.004523] sd 6:0:0:0: [sdb] Mode Sense: 23 00 00 00
[11012.004846] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11012.070815] sdb: sdb1
[11012.070970] sd 6:0:0:0: [sdb] Attached SCSI disk
[11012.324232] xhci_hcd 0000:30:00.3: Stalled endpoint for slot 1 ep 2
[11355.077282] usb 4-1: USB disconnect, device number 2
[11355.166387] sd 6:0:0:0: [sdb] Synchronizing SCSI cache
[11355.166436] sd 6:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK

Weird that there is no other output. I'll try another external USB SSD as soon as possible.
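To see how such events are spaced in a long log, the relevant lines can be pulled out of saved dmesg output with a small script. This is a minimal sketch: the keyword list is an assumption based only on the messages quoted in this report, and the function name is illustrative.

```python
import re

# "[11012.324232] message" -> timestamp in seconds, then the message text.
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")

# Assumed keywords, taken from the messages seen in this report.
KEYWORDS = ("reset SuperSpeed", "Stalled endpoint", "USB disconnect")


def usb_events(lines, keywords=KEYWORDS):
    """Return (timestamp, message) pairs for lines containing a keyword."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and any(key in match.group("msg") for key in keywords):
            events.append((float(match.group("ts")), match.group("msg")))
    return events


sample = [
    "[11012.070970] sd 6:0:0:0: [sdb] Attached SCSI disk",
    "[11012.324232] xhci_hcd 0000:30:00.3: Stalled endpoint for slot 1 ep 2",
    "[11355.077282] usb 4-1: USB disconnect, device number 2",
]
for timestamp, message in usb_events(sample):
    print(timestamp, message)
```

Feeding it the lines of a full log (for example from a file saved with `dmesg > dmesg.log`) gives a short list whose timestamps can be compared directly.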
Sometimes intermittent errors are caused by a marginal or insufficient power supply. Maybe the USB-3 ports on your computer don't provide quite enough power for the drive to work properly. Does the SSD drive have its own power supply? If it doesn't, have you tried putting a powered USB hub between the computer and the drive?
(In reply to Alan Stern from comment #10)
> Sometimes intermittent errors are caused by a marginal or insufficient power
> supply. Maybe the USB-3 ports on your computer don't provide quite enough
> power for the drive to work properly.

The motherboard is new (3 months old), and the power supply is the same age.

> Does the SSD drive have its own power supply? If it doesn't, have you tried
> putting a powered USB hub between the computer and the drive?

The USB SSD was powered by the motherboard, and I do not own a powered USB hub.
The age doesn't matter. When you say the SSD was powered by the motherboard, do you mean there was a separate connection to the motherboard (not part of the USB cable) providing power for the drive? Or do you mean that the drive received its power over the USB cable, which was plugged into the motherboard? Even if you don't own a powered USB hub, you may be able to borrow one or buy one cheaply. (There are some available on Amazon for under $15.) I admit there's a good chance that this is not the explanation for your problems. But it might be. It would explain why the drive works with other computers but not with yours.
(In reply to Alan Stern from comment #12)
[...]
> When you say the SSD was powered by the motherboard, do you mean there was a
> separate connection to the motherboard (not part of the USB cable) providing
> power for the drive? Or do you mean that the drive received its power over
> the USB cable, which was plugged into the motherboard?

No separate connection. The power was received through the USB cable.

[...]
> I admit there's a good chance that this is not the explanation for your
> problems. But it might be. It would explain why the drive works with other
> computers but not with yours.

It could be an answer to my problem, even if I doubt it.
I ran an experiment. For a test, a friend of mine swapped the NVMe drive in my PC (the one with Arch Linux on it) for one with Windows 11. I plugged the SSD into one of the motherboard USB ports and no corruption or data loss occurred; I plugged and unplugged it a few times and nothing went wrong. So it looks like a bug in how Linux manages the USB ports of my motherboard. I also plugged the SSD into one of the USB ports on my Pi 4 and accessed it over NFS. No problems at all.
One possibility is that the SSD doesn't like LPM. You can disable LPM by writing 0dd8:0562:k to /sys/module/usbcore/parameters/quirks before plugging in the drive. If that doesn't make any difference, you can try collecting a usbmon trace that shows the error occurring. Warning: The usbmon output file is likely to be enormous, and the interesting part will be only the stuff that gets written when the error happens.
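From a shell, the write described above is typically done as root with something like `echo 0dd8:0562:k | sudo tee /sys/module/usbcore/parameters/quirks`. The entry format is vendor ID, product ID and flag letters, with the IDs in hexadecimal. A minimal sketch of building and writing such an entry follows; the helper names are hypothetical, and the check only confirms that the flags are lowercase letters, not that the running kernel recognizes each one.

```python
import re

QUIRKS_PATH = "/sys/module/usbcore/parameters/quirks"


def format_quirk(vendor_id, product_id, flags):
    """Build a 'VID:PID:flags' entry such as '0dd8:0562:k'."""
    if not (0 <= vendor_id <= 0xFFFF and 0 <= product_id <= 0xFFFF):
        raise ValueError("USB vendor and product IDs are 16-bit values")
    if not re.fullmatch(r"[a-z]+", flags):
        raise ValueError("flags must be one or more lowercase letters")
    return f"{vendor_id:04x}:{product_id:04x}:{flags}"


def write_quirk(entry, path=QUIRKS_PATH):
    """Write the entry to the quirks parameter (needs root for the real path)."""
    with open(path, "w") as quirks:
        quirks.write(entry + "\n")


# The entry suggested in this thread (flag k is used here to disable LPM).
print(format_quirk(0x0DD8, 0x0562, "k"))
```

As noted above, the write has to happen before the drive is plugged in.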
(In reply to Alan Stern from comment #15)
> One possibility is that the SSD doesn't like LPM. You can disable LPM by
> writing
>
> 0dd8:0562:k
>
> to /sys/module/usbcore/parameters/quirks before plugging in the drive.

As I said in comment 9, I no longer have any access to the external SSD that produced the USB reset spam in dmesg.

> If that doesn't make any difference, you can try collecting a usbmon trace
> that shows the error occurring. Warning: The usbmon output file is likely
> to be enormous, and the interesting part will be only the stuff that gets
> written when the error happens.

I will need to buy another external SSD and see if I still get this bug. But it will take me around two weeks to do so :/
Some more info. I did some research, searching for A520M USB problems, and found these forum threads:

* https://forums.tomshardware.com/threads/about-amd-usb-issues.3698102/ (from April 2021). No solution found.
* https://community.amd.com/t5/general-discussions/a520m-boards-usb/td-p/545490 (from 2022). Solution? Buying a PCI-E USB card to avoid using the ports on the motherboard.

I'm not a big fan of avoiding the motherboard's USB ports.
Some additional info. I thought it was an Arch Linux bug, so I tried both Manjaro and Fedora live USBs, and nothing changed. It really is poor handling of this motherboard's USB ports. Hardware is not guilty here.
Oops. I meant USB HDD / SSD peripherals are not guilty here.
So I found an old 320 GB USB HDD and connected it to a USB port of my motherboard. It is nearly empty, so corruption is not really a problem for it. I copied a 3 GB ISO image. I'm attaching both dmesg.log and the usbmon output. It looks like the "reset" lines only occur with a USB SSD... and I don't have one at hand for now. Well, this annoying bug is also annoying to reproduce.
Created attachment 307720 [details] dmesg log using an USB HDD
Created attachment 307721 [details] usbmon output
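Since a usbmon trace can be very large, one way to narrow it down is to keep only the completion events whose status is non-zero. The following is a minimal sketch, assuming the text ("u") format described in the kernel's usbmon documentation, where each line starts with a URB tag, a timestamp in microseconds, an event type (S, C or E) and an address word such as `Bi:6:002:1`. The function name is illustrative, and the third sample line is made up for the example; it is not taken from the attached trace.

```python
def failed_completions(lines, bus=None):
    """Return (timestamp_us, address, status) for completions with status != 0."""
    failures = []
    for line in lines:
        fields = line.split()
        # Only completion ("C") events carry a final URB status.
        if len(fields) < 5 or fields[2] != "C":
            continue
        address = fields[3]  # type+direction : bus : device : endpoint
        parts = address.split(":")
        if len(parts) != 4:
            continue
        if bus is not None and int(parts[1]) != bus:
            continue
        # The status word may carry extra colon-separated numbers; the URB
        # status is the first one.
        try:
            status = int(fields[4].split(":")[0])
        except ValueError:
            continue
        if status != 0:
            failures.append((int(fields[1]), address, status))
    return failures


sample = [
    "d5ea89a0 3575914555 S Ci:1:001:0 s a3 00 0000 0003 0004 4 <",
    "d5ea89a0 3575914560 C Ci:1:001:0 0 4 = 01050000",
    "d5ea89a0 3575920000 C Bi:6:002:1 -32 0",  # hypothetical failing transfer
]
print(failed_completions(sample))
```

The status values are negative errno codes, so the lines this returns point at the places in the trace worth reading in full.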