Bug 196307 - "unable to handle kernel paging request" if raid1 is under heavy load
Status: RESOLVED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: MD
Hardware: Intel Linux
Importance: P1 normal
Assignee: io_md
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-07-09 00:53 UTC by Patrick
Modified: 2017-10-03 16:26 UTC

See Also:
Kernel Version: 4.12.0
Subsystem:
Regression: No
Bisected commit-id:

Description Patrick 2017-07-09 00:53:27 UTC
I reproducibly end up with the following oops if either two RAID checks run at the same time, or one RAID check runs while the disk is also being written to for an extended period:

----------------------------------------------------------
[105630.757063] md: data-check of RAID array md126
[105636.911071] md: data-check of RAID array md125
[105990.003534] BUG: unable to handle kernel paging request at ffffffffa01b73e6
[105990.003640] IP: report_bug+0x59/0xd4
[105990.003735] PGD 180a067
[105990.003735] P4D 180a067
[105990.003826] PUD 180b063
[105990.003918] PMD 1668b2067
[105990.004009] PTE 8000000169361161

[105990.004284] Oops: 0003 [#1] SMP
[105990.004377] Modules linked in: ipt_REJECT nf_reject_ipv4 nct6775 hwmon_vid nf_conntrack_ipv4 nf_defrag_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_conntrack nf_conntrack iptable_filter iptable_raw iptable_security xt_mark xt_owner iptable_mangle x86_pkg_temp_thermal coretemp raid1 pcspkr i2c_i801 i2c_core sg video nfsd auth_rpcgss oid_registry lockd grace sunrpc ip_tables dm_crypt crc32c_intel pcbc aesni_intel sd_mod aes_x86_64 crypto_simd cryptd glue_helper serio_raw r8169 mii xhci_pci xhci_hcd ipv6 crc_ccitt
[105990.004578] CPU: 1 PID: 7960 Comm: md125_resync Not tainted 4.12.0 #1
[105990.004677] Hardware name: MSI MS-7A74/B250M PRO-VH (MS-7A74), BIOS 1.30 02/03/2017
[105990.004779] task: ffff8801669a7080 task.stack: ffffc900034d8000
[105990.004879] RIP: 0010:report_bug+0x59/0xd4
[105990.004974] RSP: 0018:ffffc900034dba70 EFLAGS: 00010282
[105990.005112] RAX: 00000000a01b0907 RBX: ffffffffa01b408d RCX: ffffffffa01b73dc
[105990.005255] RDX: 0000000000000001 RSI: 000000000000030b RDI: ffffffffa01b408d
[105990.005397] RBP: ffffc900034dba88 R08: ffffffffa01b408d R09: 0000000000000001
[105990.005538] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffa01b714b
[105990.005680] R13: ffffc900034dbbc8 R14: 0000000000000006 R15: ffffc900034dbb00
[105990.005823] FS:  0000000000000000(0000) GS:ffff88016ed00000(0000) knlGS:0000000000000000
[105990.005969] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[105990.006108] CR2: ffffffffa01b73e6 CR3: 0000000169bc2000 CR4: 00000000003406e0
[105990.006250] Call Trace:
[105990.006387]  fixup_bug+0x21/0x38
[105990.006522]  do_trap+0x54/0x12c
[105990.006657]  do_error_trap+0xbe/0xcd
[105990.006794]  ? raid1_sync_request+0x732/0x88b [raid1]
[105990.006934]  ? set_next_entity+0x46/0x6d
[105990.007070]  do_invalid_op+0x1b/0x1d
[105990.007207]  invalid_op+0x18/0x20
[105990.007344] RIP: 0010:raid1_sync_request+0x732/0x88b [raid1]
[105990.007484] RSP: 0018:ffffc900034dbc78 EFLAGS: 00010212
[105990.007623] RAX: ffff880036b8b800 RBX: ffff880166929300 RCX: 0000000000000011
[105990.007767] RDX: 0000000000000010 RSI: 0000000006385f00 RDI: ffff880013d88e00
[105990.007908] RBP: ffffc900034dbd18 R08: ffffc900034dbcbc R09: 0000000000000000
[105990.008051] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880169295800
[105990.008194] R13: 0000000000000000 R14: 0000000000001000 R15: 0000000000000000
[105990.008340]  ? raid1_sync_request+0x6cc/0x88b [raid1]
[105990.008482]  ? __schedule+0x20e/0x358
[105990.008619]  ? finish_wait+0x61/0x61
[105990.008755]  md_do_sync+0x8a9/0xdb3
[105990.008892]  ? finish_wait+0x61/0x61
[105990.009029]  ? md_write_inc+0x2d/0x2d
[105990.009164]  md_thread+0x129/0x13f
[105990.009301]  ? md_thread+0x129/0x13f
[105990.009437]  ? __schedule+0x20e/0x358
[105990.009575]  ? md_write_inc+0x2d/0x2d
[105990.009712]  kthread+0xf4/0xfc
[105990.009847]  ? init_completion+0x24/0x24
[105990.009986]  ? SyS_exit_group+0xf/0xf
[105990.010122]  ret_from_fork+0x22/0x30
[105990.010256] Code: 55 4c 63 60 04 0f b7 70 08 49 01 c4 66 8b 40 0a 89 c2 83 e2 01 a8 02 74 16 66 85 d2 74 11 a8 04 41 b9 01 00 00 00 75 74 83 c8 04 <66> 89 41 0a 66 85 d2 75 05 41 89 f5 eb 23 0f b6 49 0b 45 31 c9
[105990.010526] RIP: report_bug+0x59/0xd4 RSP: ffffc900034dba70
[105990.010668] CR2: ffffffffa01b73e6
[105990.010804] ---[ end trace ce30e240b8ef7ab6 ]---
[106150.351789] BUG: unable to handle kernel paging request at ffffffffa01b73e6
[106150.351937] IP: report_bug+0x59/0xd4
[106150.352070] PGD 180a067
[106150.352070] P4D 180a067
[106150.352202] PUD 180b063
[106150.352333] PMD 1668b2067
[106150.352466] PTE 8000000169361161

[106150.352865] Oops: 0003 [#2] SMP
[106150.353000] Modules linked in: ipt_REJECT nf_reject_ipv4 nct6775 hwmon_vid nf_conntrack_ipv4 nf_defrag_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_conntrack nf_conntrack iptable_filter iptable_raw iptable_security xt_mark xt_owner iptable_mangle x86_pkg_temp_thermal coretemp raid1 pcspkr i2c_i801 i2c_core sg video nfsd auth_rpcgss oid_registry lockd grace sunrpc ip_tables dm_crypt crc32c_intel pcbc aesni_intel sd_mod aes_x86_64 crypto_simd cryptd glue_helper serio_raw r8169 mii xhci_pci xhci_hcd ipv6 crc_ccitt
[106150.353487] CPU: 0 PID: 7861 Comm: md126_resync Tainted: G      D         4.12.0 #1
[106150.353630] Hardware name: MSI MS-7A74/B250M PRO-VH (MS-7A74), BIOS 1.30 02/03/2017
[106150.353773] task: ffff88015241a1c0 task.stack: ffffc90003360000
[106150.353915] RIP: 0010:report_bug+0x59/0xd4
[106150.354062] RSP: 0018:ffffc90003363a70 EFLAGS: 00010282
[106150.354203] RAX: 00000000a01b0907 RBX: ffffffffa01b408d RCX: ffffffffa01b73dc
[106150.354346] RDX: 0000000000000001 RSI: 000000000000030b RDI: ffffffffa01b408d
[106150.354489] RBP: ffffc90003363a88 R08: ffffffffa01b408d R09: 0000000000000001
[106150.354632] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffa01b714b
[106150.354775] R13: ffffc90003363bc8 R14: 0000000000000006 R15: ffffc90003363b00
[106150.354918] FS:  0000000000000000(0000) GS:ffff88016ec00000(0000) knlGS:0000000000000000
[106150.355064] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[106150.355204] CR2: ffffffffa01b73e6 CR3: 0000000001809000 CR4: 00000000003406f0
[106150.355348] Call Trace:
[106150.355486]  fixup_bug+0x21/0x38
[106150.355622]  do_trap+0x54/0x12c
[106150.355759]  do_error_trap+0xbe/0xcd
[106150.355898]  ? raid1_sync_request+0x732/0x88b [raid1]
[106150.356039]  ? virt_to_head_page+0x3b/0x3d
[106150.356178]  ? kfree+0x24/0xaf
[106150.356314]  do_invalid_op+0x1b/0x1d
[106150.356452]  invalid_op+0x18/0x20
[106150.356590] RIP: 0010:raid1_sync_request+0x732/0x88b [raid1]
[106150.356730] RSP: 0018:ffffc90003363c78 EFLAGS: 00010212
[106150.356870] RAX: ffff88005efbac00 RBX: ffff880168ba6d00 RCX: 0000000000000011
[106150.357014] RDX: 0000000000000010 RSI: 0000000009260e80 RDI: ffff88003e6ea000
[106150.357158] RBP: ffffc90003363d18 R08: ffffc90003363cbc R09: 0000000000000000
[106150.357303] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8801669bf800
[106150.357446] R13: 0000000000000000 R14: 0000000000001000 R15: 0000000000000000
[106150.357594]  ? raid1_sync_request+0x6cc/0x88b [raid1]
[106150.357738]  ? is_mddev_idle+0x98/0xf5
[106150.357877]  md_do_sync+0x8a9/0xdb3
[106150.358015]  ? finish_wait+0x61/0x61
[106150.358153]  ? md_write_inc+0x2d/0x2d
[106150.358293]  md_thread+0x129/0x13f
[106150.358430]  ? md_thread+0x129/0x13f
[106150.358567]  ? __schedule+0x20e/0x358
[106150.358706]  ? md_write_inc+0x2d/0x2d
[106150.358844]  kthread+0xf4/0xfc
[106150.358980]  ? init_completion+0x24/0x24
[106150.359118]  ? SyS_exit_group+0xf/0xf
[106150.359256]  ret_from_fork+0x22/0x30
[106150.359394] Code: 55 4c 63 60 04 0f b7 70 08 49 01 c4 66 8b 40 0a 89 c2 83 e2 01 a8 02 74 16 66 85 d2 74 11 a8 04 41 b9 01 00 00 00 75 74 83 c8 04 <66> 89 41 0a 66 85 d2 75 05 41 89 f5 eb 23 0f b6 49 0b 45 31 c9
[106150.359671] RIP: report_bug+0x59/0xd4 RSP: ffffc90003363a70
[106150.359811] CR2: ffffffffa01b73e6
[106150.359948] ---[ end trace ce30e240b8ef7ab7 ]---
[106150.360090] ------------[ cut here ]------------
[106150.360230] WARNING: CPU: 0 PID: 7861 at kernel/exit.c:785 do_exit+0x5c/0x887
[106150.360377] Modules linked in: ipt_REJECT nf_reject_ipv4 nct6775 hwmon_vid nf_conntrack_ipv4 nf_defrag_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_conntrack nf_conntrack iptable_filter iptable_raw iptable_security xt_mark xt_owner iptable_mangle x86_pkg_temp_thermal coretemp raid1 pcspkr i2c_i801 i2c_core sg video nfsd auth_rpcgss oid_registry lockd grace sunrpc ip_tables dm_crypt crc32c_intel pcbc aesni_intel sd_mod aes_x86_64 crypto_simd cryptd glue_helper serio_raw r8169 mii xhci_pci xhci_hcd ipv6 crc_ccitt
[106150.360874] CPU: 0 PID: 7861 Comm: md126_resync Tainted: G      D         4.12.0 #1
[106150.361019] Hardware name: MSI MS-7A74/B250M PRO-VH (MS-7A74), BIOS 1.30 02/03/2017
[106150.361164] task: ffff88015241a1c0 task.stack: ffffc90003360000
[106150.361308] RIP: 0010:do_exit+0x5c/0x887
[106150.361445] RSP: 0018:ffffc90003363ef0 EFLAGS: 00010002
[106150.361585] RAX: ffffc90003363d90 RBX: ffff88015241a1c0 RCX: ffff88006f82b000
[106150.361730] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000009
[106150.361873] RBP: ffffc90003363f48 R08: 0000000000000000 R09: 0000000000000000
[106150.362017] R10: 0000000000000001 R11: 000000000000000f R12: 0000000000000009
[106150.362160] R13: 0000000000000046 R14: 0000000000000003 R15: 000000000000000b
[106150.362303] FS:  0000000000000000(0000) GS:ffff88016ec00000(0000) knlGS:0000000000000000
[106150.362451] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[106150.362591] CR2: ffffffffa01b73e6 CR3: 0000000001809000 CR4: 00000000003406f0
[106150.362734] Call Trace:
[106150.362872]  ? md_write_inc+0x2d/0x2d
[106150.363010]  ? kthread+0xf4/0xfc
[106150.363147]  rewind_stack_do_exit+0x17/0x20
[106150.363284] Code: 48 39 c8 75 22 48 8b 70 10 48 8d 48 10 48 39 ce 75 15 48 8b 50 20 48 83 c0 20 48 39 c2 0f 95 c2 0f b6 d2 eb 02 31 d2 85 d2 74 02 <0f> ff 65 8b 05 63 8b fc 7e a9 00 ff 1f 00 48 c7 c7 1f 45 70 81
[106150.363558] ---[ end trace ce30e240b8ef7ab8 ]---
----------------------------------------------------------
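
As a reading aid, the error code ("Oops: 0003") and the PTE value in the trace above can be decoded with a few lines of Python. This is only a sketch: the bit positions are the standard x86-64 ones, while the table and function names are made up for illustration. Decoded this way, the trace suggests a kernel-mode write to a page that is present but mapped read-only.

```python
# Decode the page-fault error code and the PTE flag bits from the oops.
# Bit positions follow the x86-64 architecture; names are illustrative only.

ERROR_CODE_BITS = {
    0: "protection_violation",  # 1 = page was present, access was denied
    1: "write",                 # 1 = faulting access was a write
    2: "user_mode",             # 1 = fault happened in user mode
    3: "reserved_bit",          # 1 = reserved bit set in a paging entry
    4: "instruction_fetch",     # 1 = fault was an instruction fetch
}

PTE_BITS = {
    0: "present",
    1: "writable",
    2: "user",
    5: "accessed",
    6: "dirty",
    8: "global",
    63: "no_execute",
}


def decode_bits(value, table):
    """Return a dict mapping each flag name to True/False for the given value."""
    return {name: bool((value >> bit) & 1) for bit, name in table.items()}


if __name__ == "__main__":
    # Values copied from the first oops in this report.
    print(decode_bits(0x0003, ERROR_CODE_BITS))
    print(decode_bits(0x8000000169361161, PTE_BITS))
```

The faulting address in CR2 (ffffffffa01b73e6) is RCX (ffffffffa01b73dc) plus 0xa, which matches the marked instruction bytes `<66> 89 41 0a` (a 16-bit store to offset 0xa from RCX) in the Code line.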

The hardware setup is rather basic (onboard components only) and the disks report no SMART issues.

----------------------------------------------------------
lspci -vvnn
00:00.0 Host bridge [0600]: Intel Corporation Device [8086:590f] (rev 06)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
        Latency: 0
        Capabilities: [e0] Vendor Specific Information: Len=10 <?>

00:02.0 VGA compatible controller [0300]: Intel Corporation Device [8086:5902] (rev 04) (prog-if 00 [VGA controller])
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at de000000 (64-bit, non-prefetchable) [size=16M]
        Region 2: Memory at c0000000 (64-bit, prefetchable) [size=256M]
        Region 4: I/O ports at f000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0
                        ExtTag- RBE+
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        Capabilities: [ac] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: 00000000  Data: 0000
        Capabilities: [d0] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Process Address Space ID (PASID)
                PASIDCap: Exec- Priv-, Max PASID Width: 14
                PASIDCtl: Enable- Exec- Priv-
        Capabilities: [200 v1] Address Translation Service (ATS)
                ATSCap: Invalidate Queue Depth: 00
                ATSCtl: Enable-, Smallest Translation Unit: 00
        Capabilities: [300 v1] Page Request Interface (PRI)
                PRICtl: Enable- Reset-
                PRISta: RF- UPRGI- Stopped+
                Page Request Capacity: 00008000, Page Request Allocation: 00000000

00:08.0 System peripheral [0880]: Intel Corporation Skylake Gaussian Mixture Model [8086:1911]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at df12f000 (64-bit, non-prefetchable) [size=4K]
        Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: 00000000  Data: 0000
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [f0] PCI Advanced Features
                AFCap: TP+ FLR+
                AFCtrl: FLR-
                AFStatus: TP-

00:14.0 USB controller [0c03]: Intel Corporation Device [8086:a2af] (prog-if 30 [XHCI])
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 121
        Region 0: Memory at df110000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [70] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
                Address: 00000000fee0300c  Data: 4171
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci

00:14.2 Signal processing controller [1180]: Intel Corporation Device [8086:a2b1]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin C routed to IRQ 11
        Region 0: Memory at df12e000 (64-bit, non-prefetchable) [size=4K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: 00000000  Data: 0000

00:16.0 Communication controller [0780]: Intel Corporation Device [8086:a2ba]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at df12d000 (64-bit, non-prefetchable) [disabled] [size=4K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [8c] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000

00:17.0 SATA controller [0106]: Intel Corporation Device [8086:a282] (prog-if 01 [AHCI 1.0])
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 120
        Region 0: Memory at df128000 (32-bit, non-prefetchable) [size=8K]
        Region 1: Memory at df12c000 (32-bit, non-prefetchable) [size=256]
        Region 2: I/O ports at f090 [size=8]
        Region 3: I/O ports at f080 [size=4]
        Region 4: I/O ports at f060 [size=32]
        Region 5: Memory at df12b000 (32-bit, non-prefetchable) [size=2K]
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee0100c  Data: 4161
        Capabilities: [70] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
        Kernel driver in use: ahci

00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:a296] (rev f0) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin C routed to IRQ 18
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 0000e000-0000efff
        Memory behind bridge: df000000-df0fffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0
                        ExtTag- RBE+
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #7, Speed 8GT/s, Width x1, ASPM not supported, Exit Latency L0s <1us, L1 <16us
                        ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
                        Slot #10, PowerLimit 10.000W; Interlock- NoCompl+
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
                        Changed: MRL- PresDet- LinkState+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range ABC, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd+
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: 00000000  Data: 0000
        Capabilities: [90] Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Capabilities: [a0] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
        Capabilities: [140 v1] Access Control Services
                ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
                ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        Capabilities: [220 v1] #19

00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:a2c8]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0

00:1f.2 Memory controller [0580]: Intel Corporation Device [8086:a2a1]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Region 0: Memory at df124000 (32-bit, non-prefetchable) [size=16K]

00:1f.3 Audio device [0403]: Intel Corporation Device [8086:a2f0]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:fa74]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 32, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at df120000 (64-bit, non-prefetchable) [size=16K]
        Region 4: Memory at df100000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [60] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000

00:1f.4 SMBus [0c05]: Intel Corporation Device [8086:a2a3]
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at df12a000 (64-bit, non-prefetchable) [size=256]
        Region 4: I/O ports at f040 [size=32]
        Kernel driver in use: i801_smbus
        Kernel modules: i2c_i801

01:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7a74]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 122
        Region 0: I/O ports at e000 [size=256]
        Region 2: Memory at df004000 (64-bit, non-prefetchable) [size=4K]
        Region 4: Memory at df000000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee0200c  Data: 4191
        Capabilities: [70] Express (v2) Endpoint, MSI 01
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via message/WAKE#
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
                Vector table: BAR=4 offset=00000000
                PBA: BAR=4 offset=00000800
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [140 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
                        Status: NegoPending- InProgress-
        Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
        Capabilities: [170 v1] Latency Tolerance Reporting
                Max snoop latency: 3145728ns
                Max no snoop latency: 3145728ns
        Capabilities: [178 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
                          PortCommonModeRestoreTime=150us PortTPowerOnTime=150us
        Kernel driver in use: r8169
        Kernel modules: r8169
----------------------------------------------------------
Comment 1 Shaohua Li 2017-07-10 18:11:49 UTC
Could you please try this patch? http://marc.info/?l=linux-raid&m=149967155905878&w=2
Comment 2 Patrick 2017-07-11 09:48:49 UTC
That patch works nicely. I just finished two concurrent raid checks plus some writing without any issues.
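
For anyone who wants to retest, starting two concurrent checks could be scripted roughly as below. This is a minimal sketch, not the exact procedure used here: it assumes the md sysfs interface (`/sys/block/<array>/md/sync_action`), the array names from the log (md125, md126), and root privileges. The function names and the `sysfs_root` parameter are hypothetical, added so the code can be tried against a scratch directory. The write load is not covered and would have to be generated separately.

```python
from pathlib import Path


def start_check(array, sysfs_root="/sys/block"):
    """Request a data-check on one md array by writing to its sync_action file."""
    action = Path(sysfs_root) / array / "md" / "sync_action"
    action.write_text("check\n")


def start_checks(arrays, sysfs_root="/sys/block"):
    """Request data-checks on several arrays back to back."""
    for array in arrays:
        start_check(array, sysfs_root)


def sync_state(array, sysfs_root="/sys/block"):
    """Read back the current sync_action value of an array."""
    action = Path(sysfs_root) / array / "md" / "sync_action"
    return action.read_text().strip()


# Example call on a real system (as root):
#   start_checks(["md125", "md126"])
```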
Comment 3 Shaohua Li 2017-07-17 16:38:05 UTC
There is a new patch that I'll apply upstream. It should fix the problem too, but if you can test it, we will have more confidence. Thanks!
http://marc.info/?l=linux-raid&m=150002013125258&w=2
