Bug 209177
Summary: mpt2sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:10791/_scsih_probe()!
Product: SCSI Drivers
Component: Other
Status: NEW
Severity: high
Priority: P1
Hardware: Intel
OS: Linux
Kernel Version: 5.8.6
Regression: No
Reporter: sun.nagarajan
Assignee: drivers_other
CC: ajb, growthstarboard, kennethjoyce549, kernel, linux-scsi, mason, ranjan.kumar, sun.nagarajan, toracat, zeph
Attachments:
Full dmesg with kernel 5.8.12 (dated 20200729)
dmesg.mpt3sas.max_queue_depth.20200930
Description
sun.nagarajan
2020-09-06 21:40:28 UTC
Kernel version 5.8.12 with the following patches was suggested by Suganath Prabu Subramani on Sep-29-2020:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/mpt3sas?h=v5.9-rc4&id=61e6ba03ea26f0205e535862009ff6ffdbf4de0c
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/mpt3sas?h=v5.9-rc4&id=f56577e8c7d0f3054f97d1f0d1cbe9a4d179cc47

I built kernel version 5.8.12, and these patches were already applied. The problem still exists.

EDITED dmesg output below:

[ 10.110816] mpt2sas_cm0: mpt3sas_base_attach
[ 10.110913] dca service started, version 1.12.1
[ 10.122668] mpt2sas_cm0: mpt3sas_base_map_resources
[ 10.140735] usb 2-1.7: New USB device found, idVendor=1546, idProduct=01a6, bcdDevice= 7.03
[ 10.147693] scsi host2: ahci
[ 10.163432] usb 2-1.7: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 10.173819] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (197972228 kB)
[ 10.189366] usb 2-1.7: Product: u-blox 6 - GPS Receiver
[ 10.206466] mpt2sas_cm0: _base_get_ioc_facts
[ 10.219986] usb 2-1.7: Manufacturer: u-blox AG - www.u-blox.com
[ 10.246805] mpt2sas_cm0: _base_wait_for_iocstate
[ 10.260177] scsi host3: ahci
[ 10.271074] scsi host4: ahci
[ 10.281958] scsi host5: ahci
[ 10.292565] scsi host6: ahci
[ 10.299138] usb 2-1.8: new full-speed USB device number 6 using ehci-pci
[ 10.303153] scsi host7: ahci
[ 10.328158] ata1: SATA max UDMA/133 abar m2048@0xd1700000 port 0xd1700100 irq 53
[ 10.343989] ata2: SATA max UDMA/133 abar m2048@0xd1700000 port 0xd1700180 irq 53
[ 10.359546] ata3: SATA max UDMA/133 abar m2048@0xd1700000 port 0xd1700200 irq 53
[ 10.374807] ata4: SATA max UDMA/133 abar m2048@0xd1700000 port 0xd1700280 irq 53
[ 10.389813] ata5: SATA max UDMA/133 abar m2048@0xd1700000 port 0xd1700300 irq 53
[ 10.404635] ata6: SATA max UDMA/133 abar m2048@0xd1700000 port 0xd1700380 irq 53
[ 10.412371] scsi 0:0:0:0: Direct-Access SanDisk Ultra Fit 1.00 PQ: 0 ANSI: 6
[ 10.433718] usb 2-1.8: New USB device found, idVendor=051d, idProduct=0003, bcdDevice= 1.06
[ 10.435546] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 10.450887] usb 2-1.8: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 10.464152] offset:data
[ 10.478544] usb 2-1.8: Product: Smart-UPS 2200 FW:UPS 06.3 / MCU 11.0
[ 10.488004] mpt2sas_cm0: [0x00]:03100200
[ 10.488004] mpt2sas_cm0: [0x04]:00002300
[ 10.488005] mpt2sas_cm0: [0x08]:00000000
[ 10.488005] mpt2sas_cm0: [0x0c]:00000000
[ 10.488006] mpt2sas_cm0: [0x10]:00000000
[ 10.488007] mpt2sas_cm0: [0x14]:00010080
[ 10.488007] mpt2sas_cm0: [0x18]:22137ec7
[ 10.488008] mpt2sas_cm0: [0x1c]:0001285c
[ 10.488017] mpt2sas_cm0: [0x20]:14000600
[ 10.501945] usb 2-1.8: Manufacturer: American Power Conversion
[ 10.501961] usb 2-1.8: SerialNumber: JS1051006712
[ 10.513140] mpt2sas_cm0: [0x24]:00000020
[ 10.513140] mpt2sas_cm0: [0x28]:04000020
[ 10.513141] mpt2sas_cm0: [0x2c]:00810080
[ 10.513141] mpt2sas_cm0: [0x30]:007f0003
[ 10.513142] mpt2sas_cm0: [0x34]:0020ffe0
[ 10.513154] mpt2sas_cm0: [0x38]:008004b0
[ 10.513154] mpt2sas_cm0: [0x3c]:00000011
[ 10.513155] mpt2sas_cm0: [0x40]:00000000
[ 10.513156] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[ 10.524350] sd 0:0:0:0: [sda] 30031250 512-byte logical blocks: (15.4 GB/14.3 GiB)
[ 10.535178] mpt2sas_cm0: CurrentHostPageSize(0)
[ 10.548205] sd 0:0:0:0: [sda] Write Protect is off
[ 10.556610] mpt2sas_cm0: hba queue depth(32455), max chains per io(128)
[ 10.566972] sd 0:0:0:0: [sda] Mode Sense: 43 00 00 00
[ 10.577132] mpt2sas_cm0: request frame size(128), reply frame size(128)
[ 10.589074] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 10.597175] mpt2sas_cm0: msix is supported, vector_count(1)
[ 10.692084] hid: raw HID events driver (C) Jiri Kosina
[ 10.692148] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.6.0-k
[ 10.692149] igb: Copyright (c) 2007-2014 Intel Corporation.
[ 10.705215] mpt2sas_cm0: MSI-X vectors supported: 1
[ 10.705216] no of cores: 32, max_msix_vectors: -1
[ 10.705217] mpt2sas_cm0: 0 1
[ 10.705359] mpt2sas_cm0: High IOPs queues : disabled
[ 10.757534] ata4: SATA link down (SStatus 0 SControl 300)
[ 10.761609] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 56
[ 10.761611] mpt2sas_cm0: iomem(0x00000000d1380000), mapped(0x(____ptrval____)), size(16384)
[ 10.761613] mpt2sas_cm0: ioport(0x0000000000002000), size(256)
[ 10.781648] ata1: SATA link down (SStatus 0 SControl 300)
[ 10.793026] mpt2sas_cm0: _base_get_ioc_facts
[ 10.804281] ata6: SATA link down (SStatus 0 SControl 300)
[ 10.817492] mpt2sas_cm0: _base_wait_for_iocstate
[ 10.821742] usbcore: registered new interface driver usbhid
[ 10.821743] usbhid: USB HID core driver
[ 10.829361] ata3: SATA link down (SStatus 0 SControl 300)
[ 10.906674] offset:data
[ 10.917639] ata5: SATA link down (SStatus 0 SControl 300)
[ 10.917791] input: American Megatrends Inc. Virtual Keyboard and Mouse as /devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.4/2-1.4:1.0/0003:046B:FF10.0001/input/input2
[ 10.917893] hid-generic 0003:046B:FF10.0001: input,hidraw0: USB HID v1.10 Keyboard [American Megatrends Inc. Virtual Keyboard and Mouse] on usb-0000:00:1d.0-1.4/input0
[ 10.918019] input: American Megatrends Inc. Virtual Keyboard and Mouse as /devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.4/2-1.4:1.1/0003:046B:FF10.0002/input/input3
[ 10.918245] hid-generic 0003:046B:FF10.0002: input,hidraw1: USB HID v1.10 Mouse [American Megatrends Inc. Virtual Keyboard and Mouse] on usb-0000:00:1d.0-1.4/input1
[ 10.918692] hid-generic 0003:051D:0003.0003: hiddev0,hidraw2: USB HID v1.00 Device [American Power Conversion Smart-UPS 2200 FW:UPS 06.3 / MCU 11.0] on usb-0000:00:1d.0-1.8/input0
[ 10.925117] random: fast init done
[ 10.929067] mpt2sas_cm0: [0x00]:03100200
[ 10.939600] ata2: SATA link down (SStatus 0 SControl 300)
[ 10.951294] mpt2sas_cm0: [0x04]:00002300
[ 10.984639] sda: sda1 sda2 sda3
[ 10.985180] mpt2sas_cm0: [0x08]:00000000
[ 11.005873] sd 0:0:0:0: [sda] Attached SCSI removable disk
[ 11.006343] mpt2sas_cm0: [0x0c]:00000000
[ 11.285853] mpt2sas_cm0: [0x10]:00000000
[ 11.298311] mpt2sas_cm0: [0x14]:00010080
[ 11.310617] mpt2sas_cm0: [0x18]:22137ec7
[ 11.322831] mpt2sas_cm0: [0x1c]:0001285c
[ 11.334964] mpt2sas_cm0: [0x20]:14000600
[ 11.347072] mpt2sas_cm0: [0x24]:00000020
[ 11.359060] mpt2sas_cm0: [0x28]:04000020
[ 11.370880] mpt2sas_cm0: [0x2c]:00810080
[ 11.382482] mpt2sas_cm0: [0x30]:007f0003
[ 11.393927] mpt2sas_cm0: [0x34]:0020ffe0
[ 11.405226] mpt2sas_cm0: [0x38]:008004b0
[ 11.416400] mpt2sas_cm0: [0x3c]:00000011
[ 11.427427] mpt2sas_cm0: [0x40]:00000000
[ 11.438335] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[ 11.453888] mpt2sas_cm0: CurrentHostPageSize(0)
[ 11.465416] mpt2sas_cm0: hba queue depth(32455), max chains per io(128)
[ 11.479358] mpt2sas_cm0: request frame size(128), reply frame size(128)
[ 11.493291] mpt2sas_cm0: _base_make_ioc_ready
[ 11.507135] mpt2sas_cm0: _base_get_port_facts
[ 11.519349] igb 0000:07:00.0: added PHC on eth0
[ 11.530468] igb 0000:07:00.0: Intel(R) Gigabit Ethernet Network Connection
[ 11.544129] igb 0000:07:00.0: eth0: (PCIe:5.0Gb/s:Width x4) 00:1e:67:97:4d:e9
[ 11.558034] igb 0000:07:00.0: eth0: PBA No: 100000-000
[ 11.569355] igb 0000:07:00.0: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
[ 11.616691] offset:data
[ 11.624765] mpt2sas_cm0: [0x00]:05070000
[ 11.634321] mpt2sas_cm0: [0x04]:00000000
[ 11.643579] mpt2sas_cm0: [0x08]:00000000
[ 11.652537] mpt2sas_cm0: [0x0c]:00000000
[ 11.661248] mpt2sas_cm0: [0x10]:00000000
[ 11.669892] mpt2sas_cm0: [0x14]:00003000
[ 11.678382] mpt2sas_cm0: [0x18]:00000100
[ 11.686741] mpt2sas_cm0: _base_allocate_memory_pools
[ 11.696171] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[ 11.715890] ------------[ cut here ]------------
[ 11.725227] WARNING: CPU: 0 PID: 5 at mm/page_alloc.c:4831 __alloc_pages_nodemask+0x1ce/0x310
[ 11.739330] Modules linked in: fjes(-) hid_generic usbhid hid crct10dif_pclmul igb(+) crc32_pclmul ghash_clmulni_intel dca aesni_intel ptp ahci crypto_simd mpt3sas(+) pps_core xhci_pci cryptd mlx4_core(+) raid_class i2c_algo_bit libahci xhci_pci_renesas glue_helper scsi_transport_sas wmi uas usb_storage deflate
[ 11.791023] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.8.12 #1
[ 11.803622] Hardware name: ZTSYSTEM CYPRESS11 /S2600CP , BIOS SE5C600.86B.02.06.0006.032420170950 03/24/2017
[ 11.827610] Workqueue: events work_for_cpu_fn
[ 11.838884] RIP: 0010:__alloc_pages_nodemask+0x1ce/0x310
[ 11.851367] Code: ff ff ff 65 48 8b 04 25 c0 7b 01 00 48 05 78 08 00 00 41 bd 01 00 00 00 48 89 44 24 08 e9 05 ff ff ff 81 e7 00 20 00 00 75 02 <0f> 0b 45 31 ed eb 95 44 8b 64 24 18 65 8b 05 1f a6 7a 4b 89 c0 48
[ 11.893686] RSP: 0018:ffffc18e000bbc98 EFLAGS: 00010246
[ 11.906822] RAX: 0000000000000000 RBX: 0000000000000cc0 RCX: 0000000000000000
[ 11.922228] RDX: 0000000000000000 RSI: 000000000000000b RDI: 0000000000000000
[ 11.937510] RBP: 000000000075d000 R08: 000000000075d000 R09: ffffffffffffffff
[ 11.952755] R10: 0000000000000000 R11: ffff9e6a16c22350 R12: ffffffffffffffff
[ 11.967942] R13: 0000000000000000 R14: ffff9e5215c34f58 R15: ffff9e52163590b0
[ 11.983165] FS: 0000000000000000(0000) GS:ffff9e521ea00000(0000) knlGS:0000000000000000
[ 11.999566] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 12.013320] CR2: 000055c7853e9ef0 CR3: 00000003d620a003 CR4: 00000000000606f0
[ 12.028719] Call Trace:
[ 12.038777] dma_direct_alloc_pages+0x171/0x2a0
[ 12.051185] dma_pool_alloc+0xd0/0x1c0
[ 12.062585] base_alloc_rdpq_dma_pool+0x118/0x1d0 [mpt3sas]
[ 12.076131] _base_allocate_memory_pools+0x2d6/0x1240 [mpt3sas]
[ 12.090232] mpt3sas_base_attach+0x4a4/0x930 [mpt3sas]
[ 12.103599] _scsih_probe+0x4e3/0x920 [mpt3sas]
[ 12.116383] local_pci_probe+0x42/0x90
[ 12.128401] work_for_cpu_fn+0x16/0x20
[ 12.140466] process_one_work+0x208/0x400
[ 12.152910] worker_thread+0x221/0x3e0
[ 12.165053] ? process_one_work+0x400/0x400
[ 12.177573] kthread+0x117/0x130
[ 12.188759] ? kthread_park+0x90/0x90
[ 12.200400] ret_from_fork+0x22/0x30
[ 12.211748] ---[ end trace 1d2f9a5394100a7e ]---
[ 12.224134] mpt2sas_cm0: mpt3sas_base_free_resources
[ 12.237582] mpt2sas_cm0: _base_make_ioc_ready
[ 12.249253] mpt2sas_cm0: mpt3sas_base_unmap_resources
[ 12.264417] igb 0000:07:00.1: added PHC on eth1
[ 12.276024] igb 0000:07:00.1: Intel(R) Gigabit Ethernet Network Connection
[ 12.290184] igb 0000:07:00.1: eth1: (PCIe:5.0Gb/s:Width x4) 00:1e:67:97:4d:ea
[ 12.304604] igb 0000:07:00.1: eth1: PBA No: 100000-000
[ 12.316624] igb 0000:07:00.1: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
[ 12.331505] mpt2sas_cm0: _base_release_memory_pools
[ 12.343209] mpt2sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:10791/_scsih_probe()!

Created attachment 292719
Full dmesg with kernel 5.8.12 (dated 20200729)
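The log above prints the value the driver settled on as "hba queue depth(N)". As a rough aid for anyone comparing their own dmesg against this report, here is a minimal Python sketch that pulls that number out of a saved log. The function names and the regular expression are illustrative assumptions, not part of the driver or of any tool mentioned in this bug, and the 10000 cut-off is only borrowed from the value suggested later in this thread.

```python
import re

# Matches the driver line seen in this report, e.g.
# "mpt2sas_cm0: hba queue depth(32455), max chains per io(128)".
# The pattern is an assumption based on that one log format.
QUEUE_DEPTH_RE = re.compile(r"(\w+): hba queue depth\((\d+)\)")


def find_hba_queue_depths(dmesg_text):
    """Return sorted, de-duplicated (controller, depth) pairs found in the log."""
    found = {
        (m.group(1), int(m.group(2)))
        for m in QUEUE_DEPTH_RE.finditer(dmesg_text)
    }
    return sorted(found)


def flag_large_depths(dmesg_text, threshold=10000):
    """Return the pairs whose depth exceeds an illustrative threshold."""
    return [
        (ioc, depth)
        for ioc, depth in find_hba_queue_depths(dmesg_text)
        if depth > threshold
    ]
```

Usage would be along the lines of `flag_large_depths(open("dmesg.txt").read())`, where `dmesg.txt` is a hypothetical file holding the saved kernel log.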
(2020-09-30): Suganath Prabu Subramani suggested the following:

From the log, I can see that the HBA queue depth is very high (32455), as shown below:
[ 11.465416] mpt2sas_cm0: hba queue depth(32455), max chains per io(128)
In the patch https://patchwork.kernel.org/patch/11505139/ the driver allocates the DMA-able memory for RDPQs in sets of 16 reply queues, to accommodate a limitation of Ventura series controllers. With a queue depth of 32455 and the above patch, the driver may request a DMA-able memory region so large that the kernel fails to allocate it. To confirm this, please try tuning the queue depth down to 8000 or 10000 using the module parameter "mpt3sas.max_queue_depth=10000".

With the same unmodified 5.8.12 kernel and that parameter set, boot completed with no errors. Attaching dmesg.mpt3sas.max_queue_depth.20200930

Created attachment 292733
dmesg.mpt3sas.max_queue_depth.20200930
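To make the "too large to allocate" explanation concrete, the following Python sketch works out the page-allocator order for a given request size. It rests on several assumptions that this thread does not confirm: 4 KiB pages, a MAX_ORDER of 11 (the usual x86-64 default for kernels of this era, so the largest usable order is 10, or 4 MiB), and reading the 0x75d000 in RBP/R08 of the register dump as the requested size. Under those assumptions the request needs order 11, which also matches the 0xb in RSI.

```python
PAGE_SIZE = 4096  # assumed: 4 KiB pages on x86-64
MAX_ORDER = 11    # assumed: typical x86-64 default; usable orders are 0..10


def alloc_order(size_bytes, page_size=PAGE_SIZE):
    """Smallest order n such that page_size * 2**n covers size_bytes."""
    pages = -(-size_bytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


def fits_buddy_allocator(size_bytes):
    """True if a contiguous request of this size is below the assumed MAX_ORDER."""
    return alloc_order(size_bytes) < MAX_ORDER


# 0x75d000 is taken from RBP/R08 in the trace; treating it as the
# requested size is an assumption made for illustration only.
requested = 0x75D000
print(alloc_order(requested), fits_buddy_allocator(requested))
```

A smaller queue depth shrinks the per-queue reply descriptor area, which is presumably why capping it with `mpt3sas.max_queue_depth` brought the request back under the limit on this system.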
This bug has been reported for Fedora (kernel 5.8.6): https://bugzilla.redhat.com/show_bug.cgi?id=1878332

I need to verify whether there's a Debian bug for it yet, but I noticed the bug when upgrading from Debian Buster to Bullseye, running 5.10.84:

5.10.0-10-amd64 #1 SMP Debian 5.10.84-1 (2021-12-08) x86_64 GNU/Linux

I can confirm that setting "mpt3sas.max_queue_depth=10000" worked there, albeit with some unusual and potentially cosmetic errors I'll transcribe later, once I get them in textual form.

Hi, I'm probably having an issue related to this one on 6.1.0-21-amd64 Debian/bookworm (plain new installation). I cannot see any of the devices attached to this card; lsscsi returns only the devices attached to the motherboard.

23:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
    Subsystem: Broadcom / LSI 9210-8i
    Flags: bus master, fast devsel, latency 0, IRQ 40, IOMMU group 25
    I/O ports at e000 [size=256]
    Memory at e2540000 (64-bit, non-prefetchable) [size=16K]
    Memory at e2100000 (64-bit, non-prefetchable) [size=256K]
    Expansion ROM at <ignored> [disabled]
    Capabilities: [50] Power Management version 3
    Capabilities: [68] Express Endpoint, MSI 00
    Capabilities: [d0] Vital Product Data
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [138] Power Budgeting <?>
    Capabilities: [150] Single Root I/O Virtualization (SR-IOV)
    Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
    Kernel driver in use: mpt3sas
    Kernel modules: mpt3sas

I attempted, as reported on several forums, to set `mpt3sas.max_queue_depth=10000`, as the queue depth detected was pretty low:

mpt2sas_cm0: Current Controller Queue Depth(3364),Max Controller Queue Depth(3432)

The change seems to have absolutely no effect. Shall I open a new bug? Want me to create a bug at Debian first? How can I proceed here?

I initially opened one here, as the initial installation was Mint, and the ticket was closed abruptly: https://github.com/linuxmint/cinnamon/issues/12251

It may be worth adding the whole dmesg output for the card:

mpt3sas version 43.100.00.00 loaded
mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (32779040 kB)
mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
mpt2sas_cm0: MSI-X vectors supported: 1
mpt2sas_cm0: 0 1 1
mpt2sas_cm0: High IOPs queues : disabled
mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 100
mpt2sas_cm0: iomem(0x00000000e2540000), mapped(0x00000000dddd21ca), size(16384)
mpt2sas_cm0: ioport(0x000000000000e000), size(256)
mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
mpt2sas_cm0: request pool(0x000000004246b71a) - dma(0xfed80000): depth(3492), frame_size(128), pool_size(436 kB)
mpt2sas_cm0: sense pool(0x00000000333696a2) - dma(0xcfa80000): depth(3367), element_size(96), pool_size (315 kB)
mpt2sas_cm0: reply pool(0x00000000eef15715) - dma(0xcfa00000): depth(3556), frame_size(128), pool_size(444 kB)
mpt2sas_cm0: config page(0x000000009bf3e151) - dma(0xcf9fb000): size(512)
mpt2sas_cm0: Allocated physical memory: size(7579 kB)
mpt2sas_cm0: Current Controller Queue Depth(3364),Max Controller Queue Depth(3432)
mpt2sas_cm0: Scatter Gather Elements per IO(128)
mpt2sas_cm0: overriding NVDATA EEDPTagMode setting
mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
mpt2sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
mpt2sas_cm0: sending port enable !!
mpt2sas_cm0: hba_port entry: 00000000319a47bc, port: 255 is added to hba_port list
mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x500605b002c8f75a), phys(8)
mpt2sas_cm0: port enable: SUCCESS

Hi, LSI SAS2008 PCI-Express Fusion-MPT is a pretty old card that has reached EOL. If any support is needed, please contact the Broadcom support channel.

@Ranjan is it? https://www.amazon.com/LSI-9210-8i-8-port-PCIe-Controller/dp/B01D9V14F6
The card is the LSI 9210-8i; the fact that the driver supporting it identifies it as "LSI SAS2008 PCI-Express Fusion-MPT" is another topic, I believe, no?

Hi Guido, 92XX series cards are not supported.
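For the earlier report that `mpt3sas.max_queue_depth=10000` seemed to have no effect, one thing that could be checked is whether the running module actually picked the value up. The Python sketch below reads a module parameter from sysfs. It assumes the parameter is exported under /sys/module/mpt3sas/parameters/max_queue_depth, which depends on how the driver declares it and is not confirmed anywhere in this thread; the helper name is hypothetical.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return the parameter value as a string, or None if it is not readable.

    The sysfs layout (<root>/<module>/parameters/<param>) is the generic
    kernel convention; whether mpt3sas exposes max_queue_depth there is an
    assumption.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    value = read_module_param("mpt3sas", "max_queue_depth")
    print("max_queue_depth:", value if value is not None else "not available")
```

If the file is missing or still shows the driver default, the setting most likely never reached the module, for example because the driver was loaded from an initramfs that was built without it.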