Bug 6521 - oops in libata-core.c:ata_pio_poll, qc == NULL
Status: CLOSED PATCH_ALREADY_AVAILABLE
Product: IO/Storage
Classification: Unclassified
Component: Serial ATA
Hardware: i386 Linux
Importance: P2 low
Assigned To: Jeff Garzik
Depends on:
Blocks:
Reported: 2006-05-08 22:21 UTC by Domen Puncer
Modified: 2006-05-09 09:22 UTC (History)
CC List: 0 users

See Also:
Kernel Version: 2.6.16.9
Tree: Mainline
Regression: ---


Description Domen Puncer 2006-05-08 22:21:26 UTC
Most recent kernel where this bug did not occur: N/A
Distribution: Slackware 10.2
Hardware Environment:
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID
Controller (rev 80)
        Subsystem: ASUSTeK Computer Inc. A7V600/K8V Deluxe/K8V-X motherboard
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
        Latency: 64
        Interrupt: pin B routed to IRQ 11
        Region 0: I/O ports at e400 [size=8]
        Region 1: I/O ports at e000 [size=4]
        Region 2: I/O ports at d800 [size=8]
        Region 3: I/O ports at d400 [size=4]
        Region 4: I/O ports at d000 [size=16]
        Region 5: I/O ports at c800 [size=256]
        Capabilities: [c0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-


Software Environment:
raid1, if it matters

Problem Description:
After a few days of uptime, I got the following in my log:
May  6 18:40:20 k kernel: ata2: command 0xea timeout, stat 0xd0 host_stat 0x0
May  6 18:40:20 k kernel: ata2: translated ATA stat/err 0xd0/00 to SCSI
SK/ASC/ASCQ 0xb/47/00
May  6 18:40:20 k kernel: ata2: status=0xd0 { Busy }
May  6 18:40:20 k kernel: ATA: abnormal status 0xD0 on port 0xD807
May  6 18:40:20 k last message repeated 2 times
May  6 18:40:50 k kernel: ata2: command 0x35 timeout, stat 0xd0 host_stat 0x1
May  6 18:40:50 k kernel: ata2: translated ATA stat/err 0xd0/00 to SCSI
SK/ASC/ASCQ 0xb/47/00
May  6 18:40:50 k kernel: ata2: status=0xd0 { Busy }
May  6 18:40:50 k kernel: end_request: I/O error, dev sdb, sector 58604991
May  6 18:40:50 k kernel: ATA: abnormal status 0xD0 on port 0xD807
May  6 18:40:50 k last message repeated 2 times
May  6 18:41:20 k kernel: ata2: command 0xea timeout, stat 0xd0 host_stat 0x0
May  6 18:41:20 k kernel: ata2: translated ATA stat/err 0xd0/00 to SCSI
SK/ASC/ASCQ 0xb/47/00
May  6 18:41:20 k kernel: ata2: status=0xd0 { Busy }
May  6 18:41:20 k kernel: raid1: Disk failure on sdb1, disabling device. 
May  6 18:41:20 k kernel: ^IOperation continuing on 1 devices

Today I noticed this, so I tried smartctl -d ata -a /dev/sdb and fdisk
/dev/sdb, and got the following oops:
May  9 06:47:07 k kernel: ATA: abnormal status 0xD0 on port 0xD807
May  9 06:47:07 k last message repeated 3 times
May  9 06:47:17 k kernel: ata2: command 0xec timeout, stat 0xd0 host_stat 0x0
May  9 06:47:17 k kernel: ata2: translated ATA stat/err 0xd0/00 to SCSI
SK/ASC/ASCQ 0xb/47/00
May  9 06:47:17 k kernel: ata2: status=0xd0 { Busy }
May  9 06:47:17 k kernel: ATA: abnormal status 0xD0 on port 0xD807
May  9 06:47:17 k last message repeated 2 times
May  9 06:47:27 k kernel: ata2: command 0xa1 timeout, stat 0xd0 host_stat 0x0
May  9 06:47:27 k kernel: ata2: translated ATA stat/err 0xd0/00 to SCSI
SK/ASC/ASCQ 0xb/47/00
May  9 06:47:27 k kernel: ata2: status=0xd0 { Busy }
May  9 06:47:27 k kernel: Assertion failed! qc !=
NULL,/root/linux-2.6.16.9/drivers/scsi/libata-core.c,ata_pio_poll,line=2897
May  9 06:47:47 k last message repeated 1253 times
May  9 06:47:47 k kernel: Unable to handle kernel NULL pointer dereference at
virtual address 0000008c
May  9 06:47:47 k kernel:  printing eip:
May  9 06:47:47 k kernel: c025e96c
May  9 06:47:47 k kernel: *pde = 00000000
May  9 06:47:47 k kernel: Oops: 0002 [#1]
May  9 06:47:47 k kernel: Modules linked in: bsd_comp ppp_synctty ppp_async
crc_ccitt ppp_generic slhc ipv6 ohci_hcd amd64_agp shpchp 8139too uhci_hcd
ehci_hcd via_rhine mii ext3 jbd w83627ehf hwmon eeprom i2c_isa i2c_viapro
i2c_core ide_scsi agpgart quota_v2
May  9 06:47:47 k kernel: CPU:    0
May  9 06:47:47 k kernel: EIP:    0060:[<c025e96c>]    Not tainted VLI
May  9 06:47:47 k kernel: EFLAGS: 00010297   (2.6.16.9 #4) 
May  9 06:47:47 k kernel: EIP is at ata_pio_poll+0x8c/0x100
May  9 06:47:47 k kernel: eax: 0de5fe1d   ebx: f7487280   ecx: c03f4bcc   edx:
0000d807
May  9 06:47:47 k kernel: esi: 00000000   edi: 00000004   ebp: 00000002   esp:
f7563eec
May  9 06:47:47 k kernel: ds: 007b   es: 007b   ss: 0068
May  9 06:47:47 k kernel: Process ata/0 (pid: 801, threadinfo=f7562000
task=f7f6f550)
May  9 06:47:47 k kernel: Stack: <0>f7487280 c03b081d c03c4140 c037fb37 00000b51
c048b5b0 f7487280 00000000 
May  9 06:47:47 k kernel:        f7f96440 f7487280 c025f30c f7487280 f7487830
00000296 c0128342 f7487280 
May  9 06:47:47 k kernel:        00000000 a1e3e100 f7f96458 f7f96448 c025f2c0
f7f96450 f7563f88 f7f96448 
May  9 06:47:47 k kernel: Call Trace:
May  9 06:47:47 k kernel:  [<c025f30c>] ata_pio_task+0x4c/0x80
May  9 06:47:47 k kernel:  [<c0128342>] run_workqueue+0x62/0xd0
May  9 06:47:47 k kernel:  [<c025f2c0>] ata_pio_task+0x0/0x80
May  9 06:47:47 k kernel:  [<c01284f0>] worker_thread+0x140/0x170
May  9 06:47:47 k kernel:  [<c0115f20>] default_wake_function+0x0/0x20
May  9 06:47:47 k kernel:  [<c0115f20>] default_wake_function+0x0/0x20
May  9 06:47:47 k kernel:  [<c01283b0>] worker_thread+0x0/0x170
May  9 06:47:47 k kernel:  [<c012b831>] kthread+0xb1/0xc0
May  9 06:47:47 k kernel:  [<c012b780>] kthread+0x0/0xc0
May  9 06:47:47 k kernel:  [<c0101391>] kernel_thread_helper+0x5/0x14
May  9 06:47:47 k kernel: Code: 78 1c 89 bb dc 05 00 00 31 c0 8b 5c 24 18 8b 74
24 1c 8b 7c 24 20 8b 6c 24 24 83 c4 28 c3 a1 f8 5c 3f c0 39 83 e0 05 00 00 79 13
<83> 8e 8c 00 00 00 04 c7 83 dc 05 00 00 03 00 00 00 eb ca b8 04


Steps to reproduce:
Comment 1 Tejun Heo 2006-05-08 22:56:47 UTC
This bug is fixed by the following commit, which is in libata-dev #upstream and
Linus's tree.  I don't know whether backporting the fix to -stable is necessary
though.

Author: Tejun Heo <htejun@gmail.com>
Date:   Sun Feb 12 23:32:59 2006 +0900
[tj@htj:~/os/linux-2.6]$ git-cat-file commit 86e45b6
tree 5b86ebd0b0b17d05bdfdd07b7683f7348577b52a
parent d7fc3ca1cd0ecce82263299c6b1631fc83b0ec79
author Tejun Heo <htejun@gmail.com> 1141540149 +0900
committer Jeff Garzik <jeff@garzik.org> 1142117840 -0500

[PATCH] libata: implement port_task

Implement port_task.  LLDD's can schedule a function to be executed
with context after specified delay.  libata core takes care of
synchronization against EH.  This is generalized form of pio_task and
packet_task which are tied to PIO hsm implementation.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Comment 2 Domen Puncer 2006-05-09 09:22:47 UTC
Oh, I only looked at the latest -stable.
Sorry for the noise.

Thanks!
