System always becomes unresponsive after target start, with these messages:

qla2xxx [0000:07:00.0]-00fb:1: QLogic QLE2564 - PCI-Express Quad Channel 8Gb Fibre Channel HBA.
qla2xxx [0000:07:00.0]-00fc:1: ISP2532: PCIe (5.0GT/s x8) @ 0000:07:00.0 hdma+ host#=1 fw=8.06.00 (90d5).
qla2xxx [0000:07:00.1]-001a: : MSI-X vector count: 32.
qla2xxx [0000:07:00.1]-001d: : Found an ISP2532 irq 103 iobase 0xffffb830c62d5000.
qla2xxx [0000:07:00.1]-504b:2: RISC paused -- HCCR=40, Dumping firmware.
qla2xxx [0000:07:00.1]-8033:2: Unable to reinitialize FCE (258).
qla2xxx [0000:07:00.1]-8034:2: Unable to reinitialize EFT (258).
qla2xxx [0000:07:00.1]-00af:2: Performing ISP error recovery - ha=ffff88a2624e0000.
qla2xxx [0000:07:00.1]-504b:2: RISC paused -- HCCR=40, Dumping firmware.

Trying with kernel 4.11-rc5:

Apr 07 23:39:58 : ------------[ cut here ]------------
Apr 07 23:39:58 : WARNING: CPU: 0 PID: 1468 at lib/dma-debug.c:519 add_dma_entry+0x176/0x180
Apr 07 23:39:58 : DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000013e77000
Apr 07 23:39:58 : Modules linked in: vhost_net vhost tap tun ebtable_filter ebtables ip6table_filter ip6_tables tcm_qla2xxx target_core_user uio targ
Apr 07 23:39:58 : nvme_core scsi_transport_sas
Apr 07 23:39:58 : CPU: 0 PID: 1468 Comm: qemu-system-x86 Tainted: G W I 4.11.0-0.rc5.git3.1.fc27.x86_64 #1
Apr 07 23:39:58 : Hardware name: HP ProLiant DL180 G6 , BIOS O20 07/01/2013
Apr 07 23:39:58 : Call Trace:
Apr 07 23:39:58 : dump_stack+0x8e/0xd1
Apr 07 23:39:58 : __warn+0xcb/0xf0
Apr 07 23:39:58 : warn_slowpath_fmt+0x5a/0x80
Apr 07 23:39:58 : ? active_cacheline_read_overlap+0x2e/0x60
Apr 07 23:39:58 : add_dma_entry+0x176/0x180
Apr 07 23:39:58 : debug_dma_map_sg+0x11a/0x170
Apr 07 23:39:58 : nvme_queue_rq+0x513/0x950 [nvme]
Apr 07 23:39:58 : blk_mq_try_issue_directly+0xbb/0x110
Apr 07 23:39:58 : blk_mq_make_request+0x3a9/0xa70
Apr 07 23:39:58 : ? blk_queue_enter+0xa3/0x2c0
Apr 07 23:39:58 : ? blk_queue_enter+0x39/0x2c0
Apr 07 23:39:58 : ? generic_make_request+0xf9/0x3b0
Apr 07 23:39:58 : generic_make_request+0x126/0x3b0
Apr 07 23:39:58 : ? iov_iter_get_pages+0xc9/0x330
Apr 07 23:39:58 : submit_bio+0x73/0x150
Apr 07 23:39:58 : ? submit_bio+0x73/0x150
Apr 07 23:39:58 : ? bio_iov_iter_get_pages+0xe0/0x120
Apr 07 23:39:58 : blkdev_direct_IO+0x1f7/0x3e0
Apr 07 23:39:58 : ? SYSC_io_destroy+0x1d0/0x1d0
Apr 07 23:39:58 : ? __atime_needs_update+0x7f/0x1a0
Apr 07 23:39:58 : generic_file_read_iter+0x2e5/0xad0
Apr 07 23:39:58 : ? generic_file_read_iter+0x2e5/0xad0
Apr 07 23:39:58 : ? rw_copy_check_uvector+0x8a/0x180
Apr 07 23:39:58 : blkdev_read_iter+0x35/0x40
Apr 07 23:39:58 : aio_read+0xeb/0x150
Apr 07 23:39:58 : ? sched_clock+0x9/0x10
Apr 07 23:39:58 : ? sched_clock_cpu+0x11/0xc0
Apr 07 23:39:58 : ? __might_fault+0x3e/0x90
Apr 07 23:39:58 : ? __might_fault+0x3e/0x90
Apr 07 23:39:58 : do_io_submit+0x5f8/0x920
Apr 07 23:39:58 : ? do_io_submit+0x5f8/0x920
Apr 07 23:39:58 : SyS_io_submit+0x10/0x20
Apr 07 23:39:58 : ? SyS_io_submit+0x10/0x20
Apr 07 23:39:58 : entry_SYSCALL_64_fastpath+0x1f/0xc2
Apr 07 23:39:58 : RIP: 0033:0x7f73766216a7
Apr 07 23:39:58 : RSP: 002b:00007ffc9aac6108 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
Apr 07 23:39:58 : RAX: ffffffffffffffda RBX: 000055617d90b900 RCX: 00007f73766216a7
Apr 07 23:39:58 : RDX: 00007ffc9aac6120 RSI: 0000000000000002 RDI: 00007f7377800000
Apr 07 23:39:58 : RBP: 0000000000000258 R08: 00007ffc9aac6440 R09: 000055617d9a2000
Apr 07 23:39:58 : R10: 0000556188f93cf0 R11: 0000000000000246 R12: 0000000000000280
Apr 07 23:39:58 : R13: 0000000000000130 R14: 0000000000000001 R15: 0000000000000011
Apr 07 23:39:58 : ---[ end trace 81f169903702b67d ]---
Hi Anthony,

(In reply to Anthony from comment #0)

We are working internally to reproduce this issue. We'll report what we find from the reproduction.

Thanks,
Himanshu
All my test and production environments are the same.

On the target:
RAID controller (HP SmartArray / Adaptec 65xx)
BCache in writeback mode on Intel SSD NVMe
QLE2564 in target mode

On the initiator:
QLE2562

Optical patch cables, 1 meter (MM), without an FC switch
Two cards are installed in a new test machine: QLE2462 and QLE2560.
The QLE2560 spams errors (without starting a target):

scsi host8: qla2xxx
qla2xxx [0000:05:00.1]-00fb:8: QLogic QLE2462 - PCI-Express Dual Channel 4Gb Fibre Channel HBA.
qla2xxx [0000:05:00.1]-00fc:8: ISP2432: PCIe (2.5GT/s x4) @ 0000:05:00.1 hdma+ host#=8 fw=8.06.00 (9496).
qla2xxx [0000:09:00.0]-001a: : MSI-X vector count: 32.
qla2xxx [0000:09:00.0]-001d: : Found an ISP2532 irq 16 iobase 0xffffba4886335000.
qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware.
qla2xxx [0000:09:00.0]-d001:9: Firmware dump saved to temp buffer (9/ffffba488c001000), dump status flags (0x3f).
qla2xxx [0000:09:00.0]-1005:9: Cmd 0x59 aborted with timeout since ISP Abort is pending
scsi host9: qla2xxx
qla2xxx [0000:09:00.0]-00fb:9: QLogic QLE2560 - PCI-Express Single Channel 8Gb Fibre Channel HBA.
qla2xxx [0000:09:00.0]-00fc:9: ISP2532: PCIe (5.0GT/s x8) @ 0000:09:00.0 hdma+ host#=9 fw=8.06.00 (90d5).
qla2xxx [0000:05:00.0]-500a:7: LOOP UP detected (4 Gbps).
qla2xxx [0000:05:00.1]-500a:8: LOOP UP detected (4 Gbps).
qla2xxx [0000:09:00.0]-00af:9: Performing ISP error recovery - ha=ffff98315ee30000.
qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware.
qla2xxx [0000:09:00.0]-d009:9: Firmware has been previously dumped (ffffba488c001000) -- ignoring request.
qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware.
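When several HBAs share one log, it can help to count the "RISC paused" events per PCI function to see which port is failing. The following is only a rough Python sketch: the line layout (PCI address, four-hex-digit message id, host number or blank, text) is inferred from the log lines in this report rather than from driver documentation, and the helper names `parse_line` and `count_risc_pauses` are hypothetical.

```python
import re
from collections import Counter

# Layout inferred from the log lines in this report (an assumption):
#   qla2xxx [<PCI address>]-<4 hex digit message id>:<host number or blank>: <text>
LINE_RE = re.compile(
    r"qla2xxx \[(?P<pci>[0-9a-f:.]+)\]-(?P<msg_id>[0-9a-f]{4}):(?P<host>\d+| ): (?P<text>.*)"
)


def parse_line(line):
    """Split one qla2xxx log line into its fields, or return None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    host = match.group("host").strip()
    return {
        "pci": match.group("pci"),
        "msg_id": match.group("msg_id"),
        "host": int(host) if host else None,  # host is blank before the SCSI host is registered
        "text": match.group("text"),
    }


def count_risc_pauses(lines):
    """Count 'RISC paused' messages per PCI address."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["text"].startswith("RISC paused"):
            counts[entry["pci"]] += 1
    return counts


# Sample lines copied from the log in this comment.
SAMPLE = [
    "qla2xxx [0000:09:00.0]-001a: : MSI-X vector count: 32.",
    "qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware.",
    "qla2xxx [0000:05:00.0]-500a:7: LOOP UP detected (4 Gbps).",
    "qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware.",
]

if __name__ == "__main__":
    for pci, count in count_risc_pauses(SAMPLE).items():
        print(f"{pci}: {count} RISC pause(s)")
```

In practice the lines would come from `dmesg` or the journal instead of the `SAMPLE` list; lines that do not match the assumed layout are skipped.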
I'm trying to find a working setting, and some investigation shows that with the parameter ql2xmqsupport=0 the target starts working.
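For anyone who wants to try the same workaround, here is a small Python sketch that builds the `options` line for a modprobe configuration file and reads back the value the loaded driver reports. It rests on assumptions: that the driver exposes `ql2xmqsupport` read-only under `/sys/module/qla2xxx/parameters/`, and that the line is placed in a file such as `/etc/modprobe.d/qla2xxx.conf` (the file name is only a convention, not something stated in this report). The helper names are hypothetical.

```python
from pathlib import Path

MODULE = "qla2xxx"
PARAM = "ql2xmqsupport"


def modprobe_options_line(value):
    """Return a modprobe.d 'options' line that sets the parameter when the module loads."""
    return f"options {MODULE} {PARAM}={value}"


def read_current_value(sysfs_root="/sys"):
    """Return the value reported by the loaded module, or None if it cannot be read.

    Assumes the parameter is visible in sysfs; sysfs_root is configurable
    only so the function can be exercised without the hardware.
    """
    path = Path(sysfs_root) / "module" / MODULE / "parameters" / PARAM
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # Line to place in a modprobe.d file (file name is an assumption).
    print(modprobe_options_line(0))
    # Value the currently loaded driver reports, if any.
    print("current value:", read_current_value())
```

The new value only takes effect when the module is loaded again, and if the driver is loaded from the initramfs, the initramfs probably has to be regenerated as well.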
----- Original Message -----
> From: bugzilla-daemon@bugzilla.kernel.org
> To: linux-scsi@kernel.org
> Sent: Tuesday, May 16, 2017 10:45:22 AM
> Subject: [Bug 195285] qla2xxx FW immediatly crashing after target start
>
> https://bugzilla.kernel.org/show_bug.cgi?id=195285
>
> --- Comment #4 from Anthony (anthony.bloodoff@gmail.com) ---
> I'm trying to select working setting, and some investigation shows:
> with parameter ql2xmqsupport=0 - target starting to work

OK, I default to MQ so will look at testing without it later.
Hi Anthony, Laurence,

Can you try the attached patch to see if it works for you?

If yes, I'll send it to the SCSI mailing list for inclusion upstream.

Thanks,
Himanshu
Created attachment 256619
Patch to address target configuration for ISP25xx
> --- Comment #6 from himanshu.madhani@cavium.com (himanshu.madhani@qlogic.com) ---
> Hi Anthony, Laurence,
>
> Can you try attached patch to see if it works for you?

Absolutely, and thanks
Regards
Laurence
The patch works fine on 4.12.0-0.rc1 with ql2xmqsupport enabled.
Hi Anthony,

(In reply to Anthony from comment #9)
> patch work fine on 4.12.0-0.rc1 with ql2xmqsupport enabled

Thanks for the validation. I'll send this patch to the SCSI tree with proper tags.

-Himanshu
> Absolutely, and thanks
> Regards
> Laurence

It's working fine for me too now.
Thanks!!
Laurence
(In reply to loberman from comment #11)
> Its working fine for me too now
> Thanks!!
> Laurence

Thanks, Laurence. I appreciate your effort in testing this out.
The patch is upstream now. Please close the bug.