Bug 109661

Summary: Weird errors accessing the disk under kernel 4.3.3
Product: IO/Storage Reporter: Artem S. Tashkinov (aros)
Component: Block LayerAssignee: Jens Axboe (axboe)
Status: RESOLVED CODE_FIX    
Severity: blocking CC: ArAge, marci_r, nholland, torvalds
Priority: P1    
Hardware: i386   
OS: Linux   
Kernel Version: 4.3.3 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: dmesg
Various information including .config dmesg and /proc files
/proc and dmesg for 4.1.15
All "bad" dmesg's and bisect log
blk-core.fix.blk_queue_split.c.patch
attachment-15407-0.html

Description Artem S. Tashkinov 2015-12-20 03:08:33 UTC
Created attachment 197801 [details]
dmesg

Right after booting when I try to start Firefox I see this:

[  214.464060] ata1.00: exception Emask 0x2 SAct 0x600 SErr 0x3040400 action 0x6
[  214.464065] ata1.00: irq_stat 0x45000008
[  214.464069] ata1: SError: { Proto CommWake TrStaTrns UnrecFIS }
[  214.464072] ata1.00: failed command: READ FPDMA QUEUED
[  214.464078] ata1.00: cmd 60/e0:48:20:c8:22/07:00:04:00:00/40 tag 9 ncq 1032192 in
[  214.464078]          res 41/84:e0:20:c8:22/00:07:04:00:00/00 Emask 0x412 (ATA bus error) <F>
[  214.464081] ata1.00: status: { DRDY ERR }
[  214.464083] ata1.00: error: { ICRC ABRT }
[  214.464085] ata1.00: failed command: READ FPDMA QUEUED
[  214.464091] ata1.00: cmd 60/88:50:a0:5c:21/00:00:04:00:00/40 tag 10 ncq 69632 in
[  214.464091]          res 40/00:50:a0:5c:21/00:00:04:00:00/40 Emask 0x2 (HSM violation)
[  214.464093] ata1.00: status: { DRDY }
[  214.464098] ata1: hard resetting link
[  214.783419] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  214.783814] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150818/psargs-359)
[  214.783822] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ef02c720), AE_NOT_FOUND (20150818/psparse-542)
[  214.784551] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20150818/psargs-359)
[  214.784559] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ef02c720), AE_NOT_FOUND (20150818/psparse-542)
[  214.784789] ata1.00: configured for UDMA/133
[  214.784894] ata1: EH complete

This didn't happen under Linux 4.1.10.

The previous boot was even more weird as it contained hundreds of errors like these:

Corrupted low memory at c0001000 (1000 phys) = 6a000002
Corrupted low memory at c0001004 (1004 phys) = 501f6a01
Corrupted low memory at c0001008 (1008 phys) = fd1ecde8
Corrupted low memory at c000100c (100c phys) = 10c483ff
Corrupted low memory at c0001010 (1010 phys) = dae9c931
Corrupted low memory at c0001014 (1014 phys) = 8b000000
Corrupted low memory at c0001018 (1018 phys) = 3c83807d

I'm attaching the log from my previous boot.

My HDD doesn't have any errors because I've run badblocks on it and no problems were reported.

I've no idea what is wrong with kernel 4.3.3 but it basically unusable here. 4.1.10 worked just fine.

HW configuration:

Intel Core i5 2500
ASUS P8P67 Pro
16GB DDR3 RAM
Linux Kernel i686 PAE
Comment 1 Artem S. Tashkinov 2015-12-20 03:13:15 UTC
Created attachment 197811 [details]
Various information including .config dmesg and /proc files
Comment 2 Artem S. Tashkinov 2015-12-20 03:28:27 UTC
I've rebooted into 4.1.15 - no errors, everything is back to normal.
Comment 3 Artem S. Tashkinov 2015-12-20 03:30:50 UTC
Created attachment 197821 [details]
/proc and dmesg for 4.1.15
Comment 4 Artem S. Tashkinov 2015-12-20 03:39:31 UTC
4.1.15 vs 4.3.3:

dmesg.diff
-Linux version 4.1.15-ic (root@sysresccd) (gcc version 4.7.4 (birdie) (GCC) ) #1 SMP PREEMPT Sun Dec 20 13:23:05 YEKT 2015
+Linux version 4.3.3-ic (root@localhost.localdomain) (gcc version 4.7.4 (birdie) (GCC) ) #1 SMP PREEMPT Sat Dec 19 08:29:41 YEKT 2015
+x86/fpu: xstate_offset[2]: 0240, xstate_sizes[2]: 0100
+x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers'
+x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers'
+x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers'
+x86/fpu: Enabled xstate features 0x7, context size is 0x340 bytes, using 'standard' format.
+x86/fpu: Using 'eager' FPU context switches.


-PAT configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- UC..
+x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT..


-initial memory mapped: [mem 0x00000000-0x1e9fffff]
+initial memory mapped: [mem 0x00000000-0x133fffff]


-BRK [0x1e5af000, 0x1e5affff] PGTABLE
-BRK [0x1e5b0000, 0x1e5b0fff] PGTABLE
+BRK [0x12e79000, 0x12e79fff] PGTABLE
+BRK [0x12e7a000, 0x12e7afff] PGTABLE


-free_area_init_node: node 0, pgdat de3c4480, node_mem_map ef81e020
+free_area_init_node: node 0, pgdat d2ce5f80, node_mem_map ef81e020


-PERCPU: Embedded 17 pages/cpu @ef7c7000 s40588 r0 d29044 u69632
-pcpu-alloc: s40588 r0 d29044 u69632 alloc=17*4096
+PERCPU: Embedded 17 pages/cpu @ef7c7000 s38924 r0 d30708 u69632
+pcpu-alloc: s38924 r0 d30708 u69632 alloc=17*4096


xsave: enabled xstate_bv 0x7, cntxt size 0x340 using standard form
 Initializing HighMem for node 0 (00037bfe:0041f000)
 Initializing Movable for node 0 (00000000:00000000)
-Memory: 16600456K/16743568K available (3680K kernel code, 204K rwdata, 1016K rodata, 440K init, 488K bss, 143112K reserved, 0K cma-reserved, 15830568K highmem)
+Memory: 16600672K/16743568K available (3761K kernel code, 230K rwdata, 1048K rodata, 440K init, 500K bss, 142896K reserved, 0K cma-reserved, 15830568K highmem)
 virtual kernel memory layout:
     fixmap  : 0xfff17000 - 0xfffff000   ( 928 kB)
     pkmap   : 0xffc00000 - 0xffe00000   (2048 kB)
     vmalloc : 0xf83fe000 - 0xffbfe000   ( 120 MB)
     lowmem  : 0xc0000000 - 0xf7bfe000   ( 891 MB)
-      .init : 0xde3cd000 - 0xde43b000   ( 440 kB)
-      .data : 0xde298724 - 0xde3cb1c0   (1226 kB)
-      .text : 0xddf00000 - 0xde298724   (3681 kB)
+      .init : 0xd2cef000 - 0xd2d5d000   ( 440 kB)
+      .data : 0xd2bacaf6 - 0xd2cedbc0   (1284 kB)
+      .text : 0xd2800000 - 0xd2bacaf6   (3762 kB)


-RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
+RCU: Adjusting geometry for rcu_fanout_leaf=32, nr_cpu_ids=4


-Freeing SMP alternatives memory: 16K (de43b000 - de43f000)
+Freeing SMP alternatives memory: 16K (d2d5d000 - d2d61000)


meminfo.diff:
-MemTotal:       16600912 kB
-MemFree:        15671900 kB
-MemAvailable:   16026928 kB
-Buffers:           46780 kB
-Cached:           353420 kB
+MemTotal:       16601128 kB
+MemFree:        14803640 kB
+MemAvailable:   15328176 kB
+Buffers:           54592 kB
+Cached:           674460 kB
 SwapCached:            0 kB
-Active:           587876 kB
-Inactive:         270432 kB
-Active(anon):     460772 kB
-Inactive(anon):     4120 kB
-Active(file):     127104 kB
-Inactive(file):   266312 kB
+Active:          1289268 kB
+Inactive:         400448 kB
+Active(anon):    1056304 kB
+Inactive(anon):    72184 kB
+Active(file):     232964 kB
+Inactive(file):   328264 kB
 Unevictable:           0 kB
 Mlocked:               0 kB
 HighTotal:      15830568 kB
-HighFree:       15030096 kB
-LowTotal:         770344 kB
-LowFree:          641804 kB
+HighFree:       14247628 kB
+LowTotal:         770560 kB
+LowFree:          556012 kB
 SwapTotal:             0 kB
 SwapFree:              0 kB
-Dirty:                28 kB
+Dirty:                44 kB
 Writeback:             0 kB
-AnonPages:        458092 kB
-Mapped:           183880 kB
-Shmem:              6800 kB
-Slab:              25340 kB
-SReclaimable:      11940 kB
-SUnreclaim:        13400 kB
-KernelStack:        2264 kB
-PageTables:         5476 kB
+AnonPages:        960660 kB
+Mapped:           308236 kB
+Shmem:            167840 kB
+Slab:              31740 kB
+SReclaimable:      15416 kB
+SUnreclaim:        16324 kB
+KernelStack:        3376 kB
+PageTables:        13268 kB
 NFS_Unstable:          0 kB
 Bounce:                0 kB
 WritebackTmp:          0 kB
-CommitLimit:     8300456 kB
-Committed_AS:    1258724 kB
+CommitLimit:     8300564 kB
+Committed_AS:    5356632 kB
 VmallocTotal:     122880 kB
-VmallocUsed:       54196 kB
+VmallocUsed:       54512 kB
 VmallocChunk:      65524 kB
-AnonHugePages:    143360 kB
-DirectMap4k:       49144 kB
-DirectMap2M:      864256 kB
+AnonHugePages:    268288 kB
+DirectMap4k:       88056 kB
+DirectMap2M:      825344 kB


mttr.diff:
 reg04: base=0x200000000 ( 8192MB), size= 8192MB, count=1: write-back
 reg05: base=0x400000000 (16384MB), size=  512MB, count=1: write-back
 reg06: base=0x41f000000 (16880MB), size=   16MB, count=1: uncachable
-reg07: base=0x0f1000000 ( 3856MB), size=   16MB, count=1: write-combining
Comment 5 Artem S. Tashkinov 2015-12-20 03:41:08 UTC
It looks like kernel 4.3 (and 4.2 might as well - I haven't tested it) has its memory management in case of PAE 16GB RAM totally borked up.

I haven't checked kernel 4.4.

Please advise.
Comment 6 Linus Torvalds 2015-12-20 03:54:15 UTC
Can you try to figure out where it started happening?

I din't think we've seen this kind of report before, and 4.2 is used by most distributions by now, so I'd have expected to hear about it if people saw it.

So please test 4.2, to see if that works.

It would be even better if you could then bisect things. "git bisect" isn't that hard to use, and is very efficient at hunting down when bugs start happening if those bugs are easy to reproduce and aren't timing-related.
Comment 7 Artem S. Tashkinov 2015-12-20 07:58:11 UTC
You were right - kernel 4.2.x is fine.

# git bisect bad
b54ffb73cadcdcff9cc1ae0e11f502407e3e2e4c is the first bad commit
commit b54ffb73cadcdcff9cc1ae0e11f502407e3e2e4c
Author: Kent Overstreet <kent.overstreet@gmail.com>
Date:   Tue May 19 14:31:01 2015 +0200

    block: remove bio_get_nr_vecs()

    We can always fill up the bio now, no need to estimate the possible
    size based on queue parameters.

    Acked-by: Steven Whitehouse <swhiteho@redhat.com>
    Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
    [hch: rebased and wrote a changelog]
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>

:040000 040000 3ae241a95c9928fa295f11632d9842643e325802 b1375609d88a0700577f22e52d4991ed5188570b M      block
:040000 040000 f713ab61fa6b39cadb179bec7c881ecaab5ecd45 d58096c89e217ca8624c60282cf3fa7727707116 M      drivers
:040000 040000 bc01b8ef2831851662533bc06f319d6cc375ec89 2080b43bdd2ac409dd56d7dcdeefcac957ec5d42 M      fs
:040000 040000 761518c382b387766fcdfd6b590813967a26e812 2641a966056d4184fe02ca2ee14576481810ed32 M      include

P.S. It has taken me more than 21 reboots, 2,5+ hours (I couldn't build the kernel after finding bad commits which have been six or seven) and 15 compilations.
Comment 8 Artem S. Tashkinov 2015-12-20 08:04:01 UTC
Created attachment 197841 [details]
All "bad" dmesg's and bisect log
Comment 9 Linus Torvalds 2015-12-20 17:21:44 UTC
(In reply to Artem S. Tashkinov from comment #7)
> 
> P.S. It has taken me more than 21 reboots, 2,5+ hours (I couldn't build the
> kernel after finding bad commits which have been six or seven) and 15
> compilations.

Thanks for the bisect. And yes, bisection isn't necessarily painfree, but considering that there's about 13 *thousand* commits between 4.2 and 4.3, doing about 20 reboots is still pretty efficient.

I'll talk to the guilty parties. Thanks again. - Linus
Comment 10 Nils Holland 2015-12-21 11:54:01 UTC
For what it's worth, I've been seeing what I believe to be similar issues here as well. What I can say with absolute certainty is that every kernel up to (and including) 4.3.2 was fine. However, with 4.3.3 I get strange ata-related issues in the logs, very early after booting:

[ 1711.670795] ata1.00: exception Emask 0x0 SAct 0x500378 SErr 0x0 action 0x6 frozen
[ 1711.670812] ata1.00: failed command: READ FPDMA QUEUED
[ 1711.670830] ata1.00: cmd 60/20:18:c0:df:23/00:00:03:00:00/40 tag 3 ncq 16384 in
[ 1711.670830]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1711.670839] ata1.00: status: { DRDY }
[ 1711.670846] ata1.00: failed command: WRITE FPDMA QUEUED
[ 1711.670860] ata1.00: cmd 61/00:20:68:fe:19/08:00:16:00:00/40 tag 4 ncq 1048576 out
[ 1711.670860]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1711.670867] ata1.00: status: { DRDY }
[ 1711.670873] ata1.00: failed command: WRITE FPDMA QUEUED
[ 1711.670886] ata1.00: cmd 61/80:28:68:06:1a/05:00:16:00:00/40 tag 5 ncq 720896 out
[ 1711.670886]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1711.670893] ata1.00: status: { DRDY }
[ 1711.670898] ata1.00: failed command: WRITE FPDMA QUEUED
[ 1711.670912] ata1.00: cmd 61/80:30:e8:0b:1a/02:00:16:00:00/40 tag 6 ncq 327680 out
[ 1711.670912]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1711.670918] ata1.00: status: { DRDY }
[ 1711.670924] ata1.00: failed command: WRITE FPDMA QUEUED
[ 1711.670937] ata1.00: cmd 61/58:40:68:0e:1a/05:00:16:00:00/40 tag 8 ncq 700416 out
[ 1711.670937]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1711.670944] ata1.00: status: { DRDY }
[ 1711.670949] ata1.00: failed command: WRITE FPDMA QUEUED
[ 1711.670962] ata1.00: cmd 61/80:48:88:4a:1a/03:00:16:00:00/40 tag 9 ncq 458752 out
[ 1711.670962]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1711.670969] ata1.00: status: { DRDY }
[ 1711.670975] ata1.00: failed command: READ FPDMA QUEUED
[ 1711.670988] ata1.00: cmd 60/30:a0:38:6e:45/00:00:12:00:00/40 tag 20 ncq 24576 in
[ 1711.670988]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1711.670996] ata1.00: status: { DRDY }
[ 1711.671001] ata1.00: failed command: READ FPDMA QUEUED
[ 1711.671015] ata1.00: cmd 60/08:b0:90:6e:6c/00:00:15:00:00/40 tag 22 ncq 4096 in
[ 1711.671015]          res 40/00:b0:b0:8c:6c/00:00:15:00:00/40 Emask 0x4 (timeout)
[ 1711.671022] ata1.00: status: { DRDY }
[ 1711.671032] ata1: hard resetting link
[ 1711.975808] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 1711.979935] ata1.00: configured for UDMA/133
[ 1711.979955] ata1.00: device reported invalid CHS sector 0
[ 1711.979963] ata1.00: device reported invalid CHS sector 0
[ 1711.979970] ata1.00: device reported invalid CHS sector 0
[ 1711.979975] ata1.00: device reported invalid CHS sector 0
[ 1711.979981] ata1.00: device reported invalid CHS sector 0
[ 1711.979986] ata1.00: device reported invalid CHS sector 0
[ 1711.979992] ata1.00: device reported invalid CHS sector 0
[ 1711.980023] ata1: EH complete

The system actually runs more of less normally, but occasionally locks up for half a minute or so. During the couple of days that this has been going on (and before I really noticed it), two files actually happened to become corrupt on the btrfs filesystem contained on the disk.

Interestingly, after a whole load of these errors, I get this line:

[ 1774.651390] ata1.00: NCQ disabled due to excessive errors

And after that, the system will run fine without any additional errors or otherwise noticeable problems for hours.

Unfortunately, I'm on a vacation and can't test much. I don't even know whether my issues are related to Artem's, but trying to boot without NCQ might be a workaround (that I myself still have to actually test) and then there's the question of 4.3 does really already exhibit the symptoms in Artem's case, because at least in my case, 4.3 was definitely fine and issues only started with 4.3.3...
Comment 11 Artem S. Tashkinov 2015-12-22 06:03:09 UTC
Created attachment 197971 [details]
blk-core.fix.blk_queue_split.c.patch

(In reply to Nils Holland from comment #10)
> For what it's worth, I've been seeing what I believe to be similar issues
> here as well. 

Try this patch and if it doesn't fix the issue for you, you might have a different problem.
Comment 12 Jens Axboe 2015-12-22 17:29:17 UTC
Merged the fix.
Comment 13 Nils Holland 2015-12-22 19:56:56 UTC
(In reply to Artem S. Tashkinov from comment #11)
> Created attachment 197971 [details]
> blk-core.fix.blk_queue_split.c.patch
> 
> Try this patch and if it doesn't fix the issue for you, you might have a
> different problem.

I can confirm that this patch fixes my issue well. I've applied it to my 4.3.3 sources which previously would only work reliably with NCQ turned off and emitted all kinds of errors (see comment #10) with it turned on. Now, with this patch applied, I can turn NCQ back on and no longer see anything weird in dmesg.
Comment 14 Artem S. Tashkinov 2015-12-23 00:25:05 UTC
Hooray!

I wonder how on Earth I was the only person to spot this bug and report it. I'm sorry for the noise I'm making here but I'm really curious.
Comment 15 Linus Torvalds 2015-12-23 00:56:15 UTC
Created attachment 198041 [details]
attachment-15407-0.html

On Dec 22, 2015 4:25 PM, <bugzilla-daemon@bugzilla.kernel.org> wrote:
>
> I wonder how on Earth I was the only person to spot this bug and report
it. I'm
> sorry for the noise I'm making here but I'm really curious.

I suspect that most people who upgrade to new kernels have long since moved
to x86-64, so the PAE cases get very little testing..

Anyway, thanks for the report and bisect and testing..

    Linus
Comment 16 Rene van der Helm 2015-12-31 12:14:08 UTC
Hello!

First of all I like to mention I'm just an end-user with no programming skills and kernel stuff is all new to me. I'm just a home-server maintainer trying to maintain my home network.

But I ran into a lot of problems a few days ago, the same problems as described above. And until now (duh) I couldn't find any information concerning this bug so i figured it out myself. 

It cost me three whole days, two new harddrives and a lot of frustration to solve this matter but I got it running and learnt a lot.

But I like to address my concerns and I have a question. How long does it take before a new kernel version is available for the Debian testing branch? I ask this because a lot of people, like me, will have the same problems when they update Debian. Debian testing (https://packages.debian.org/testing/kernel/) still uses 4.3.0.1 as of 31-12-2015. 

FYI: At the moment I'm using Linux debian 3.16.0-4-686-pae #1 SMP Debian 3.16.7-ckt20-1+deb8u1 (2015-12-14) i686 GNU/Linux and that works fine.

Best regards
Rene van der Helm
Comment 17 Artem S. Tashkinov 2015-12-31 12:27:54 UTC
You should directly ask Debian maintainers who (as far as I know) don't frequent kernel's bugzilla.

Also I guess it will help a lot if you open a bug report at Debian's bugzilla and reference this bug - they might include the fix faster.
Comment 18 Rene van der Helm 2016-01-01 11:17:12 UTC
Done, sent an email to the Debian Kernel Team. Thanks for your advice and wish you a Happy New Year!
Comment 19 Artem S. Tashkinov 2016-01-23 22:45:21 UTC
Strangely kernel 4.3.4, which has just been released, doesn't include the fix for this issue.