Bug 16627 - sdhci shows backtrace while resuming after s2disk
Summary: sdhci shows backtrace while resuming after s2disk
Status: RESOLVED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: MMC/SD
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_mmc-sd
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-08-19 10:01 UTC by Oleksandr Natalenko
Modified: 2012-08-13 15:55 UTC

See Also:
Kernel Version: 2.6.35.2
Subsystem:
Regression: No
Bisected commit-id:


Attachments
config, lspci and dmesg (33.96 KB, application/x-bzip)
2010-08-19 10:01 UTC, Oleksandr Natalenko
lspci before and after resume (3.05 KB, application/x-bzip)
2010-08-20 12:20 UTC, Oleksandr Natalenko

Description Oleksandr Natalenko 2010-08-19 10:01:09 UTC
Created attachment 27506
config, lspci and dmesg

The following strings appear in dmesg after resuming:

===
[  199.151739] Call Trace:
[  199.151749]  [<c1073734>] ? __report_bad_irq+0x24/0x90
[  199.151753]  [<c10738f0>] ? note_interrupt+0x150/0x190
[  199.151757]  [<c10751f1>] ? move_native_irq+0x11/0x50
[  199.151762]  [<c107413b>] ? handle_fasteoi_irq+0xab/0xd0
[  199.151765]  [<c1074090>] ? handle_fasteoi_irq+0x0/0xd0
[  199.151768]  <IRQ>  [<c1004657>] ? do_IRQ+0x47/0xc0
[  199.151775]  [<c10030e9>] ? common_interrupt+0x29/0x30
[  199.151780]  [<c103007b>] ? __sched_setscheduler+0x26b/0x400
[  199.151786]  [<c122f581>] ? acpi_idle_enter_c1+0x9d/0xb2
[  199.151792]  [<c135e586>] ? cpuidle_idle_call+0x76/0xe0
[  199.151795]  [<c1001bef>] ? cpu_idle+0x3f/0x90
[  199.151800]  [<c15ee8d7>] ? start_kernel+0x2e9/0x2ee
[  199.151804]  [<c15ee42c>] ? unknown_bootoption+0x0/0x190
[  199.151806] handlers:
[  199.151808] [<f8bfac50>] (r852_irq+0x0/0x250 [r852])
[  199.151820] [<c136a080>] (sdhci_irq+0x0/0x570)
===
...
===
[  208.256014] mmc0: Timeout waiting for hardware interrupt.
[  208.256019] sdhci: ============== REGISTER DUMP ==============
[  208.256025] sdhci: Sys addr: 0x00000000 | Version:  0x00000400
[  208.256030] sdhci: Blk size: 0x00000000 | Blk cnt:  0x00000000
[  208.256035] sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[  208.256040] sdhci: Present:  0x01f70000 | Host ctl: 0x00000001
[  208.256044] sdhci: Power:    0x0000000f | Blk gap:  0x00000000
[  208.256049] sdhci: Wake-up:  0x00000000 | Clock:    0x00004007
[  208.256054] sdhci: Timeout:  0x00000000 | Int stat: 0x00000000
[  208.256058] sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
[  208.256063] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000001
[  208.256068] sdhci: Caps:     0x00c02120 | Max curr: 0x00000040
[  208.256070] sdhci: ===========================================
===

lspci, dmesg and config are attached.
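
For readers who want to interpret the register dump above, the following is a minimal sketch that parses the dump lines and checks a few bits. The function names `parse_register_dump` and `summarize` are hypothetical helpers. The bit masks are assumed to follow the standard SDHCI register layout as named in the Linux `sdhci.h` header (`SDHCI_CARD_PRESENT`, `SDHCI_CLOCK_CARD_EN`, `SDHCI_POWER_ON`).

```python
import re

# Register dump copied from the report (timestamps dropped).
DUMP = """\
sdhci: Sys addr: 0x00000000 | Version:  0x00000400
sdhci: Blk size: 0x00000000 | Blk cnt:  0x00000000
sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
sdhci: Present:  0x01f70000 | Host ctl: 0x00000001
sdhci: Power:    0x0000000f | Blk gap:  0x00000000
sdhci: Wake-up:  0x00000000 | Clock:    0x00004007
sdhci: Timeout:  0x00000000 | Int stat: 0x00000000
sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
sdhci: AC12 err: 0x00000000 | Slot int: 0x00000001
sdhci: Caps:     0x00c02120 | Max curr: 0x00000040
"""

# One "Name: 0xVALUE" pair; names may contain spaces, digits and hyphens.
FIELD = re.compile(r"([A-Za-z0-9][A-Za-z0-9 \-]*?):\s+0x([0-9a-fA-F]+)")


def parse_register_dump(text):
    """Return {register name: integer value} for every 'sdhci:' line."""
    regs = {}
    for line in text.splitlines():
        # Ignore anything before the "sdhci: " prefix (timestamps, quote marks).
        _, sep, body = line.partition("sdhci: ")
        if not sep:
            continue
        for name, value in FIELD.findall(body):
            regs[name] = int(value, 16)
    return regs


def summarize(regs):
    """Decode a few bits (masks assumed from the standard SDHCI layout)."""
    return {
        "card_present": bool(regs["Present"] & 0x00010000),   # SDHCI_CARD_PRESENT
        "sd_clock_enabled": bool(regs["Clock"] & 0x0004),     # SDHCI_CLOCK_CARD_EN
        "bus_power_on": bool(regs["Power"] & 0x01),           # SDHCI_POWER_ON
        "irq_pending": regs["Int stat"] != 0,
    }


if __name__ == "__main__":
    for key, value in summarize(parse_register_dump(DUMP)).items():
        print(key, value)
```

Under these assumptions the dump describes a powered, clocked controller with a card present and no interrupt status bit set, which is consistent with the "Timeout waiting for hardware interrupt" message.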
Comment 1 Andrew Morton 2010-08-19 22:46:21 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 19 Aug 2010 10:01:14 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=16627
> 
>            Summary: sdhci shows backtrace while resuming after s2disk
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 2.6.35.2
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: MMC/SD
>         AssignedTo: drivers_mmc-sd@kernel-bugs.osdl.org
>         ReportedBy: pfactum@gmail.com
>         Regression: No
> 
> 
> Created an attachment (id=27506)
>  --> (https://bugzilla.kernel.org/attachment.cgi?id=27506)
> config, lspci and dmesg
> 
> The following strings appear in dmesg after resuming:
> 
> ===
> [  199.151739] Call Trace:
> [  199.151749]  [<c1073734>] ? __report_bad_irq+0x24/0x90
> [  199.151753]  [<c10738f0>] ? note_interrupt+0x150/0x190
> [  199.151757]  [<c10751f1>] ? move_native_irq+0x11/0x50
> [  199.151762]  [<c107413b>] ? handle_fasteoi_irq+0xab/0xd0
> [  199.151765]  [<c1074090>] ? handle_fasteoi_irq+0x0/0xd0
> [  199.151768]  <IRQ>  [<c1004657>] ? do_IRQ+0x47/0xc0
> [  199.151775]  [<c10030e9>] ? common_interrupt+0x29/0x30
> [  199.151780]  [<c103007b>] ? __sched_setscheduler+0x26b/0x400
> [  199.151786]  [<c122f581>] ? acpi_idle_enter_c1+0x9d/0xb2
> [  199.151792]  [<c135e586>] ? cpuidle_idle_call+0x76/0xe0
> [  199.151795]  [<c1001bef>] ? cpu_idle+0x3f/0x90
> [  199.151800]  [<c15ee8d7>] ? start_kernel+0x2e9/0x2ee
> [  199.151804]  [<c15ee42c>] ? unknown_bootoption+0x0/0x190
> [  199.151806] handlers:
> [  199.151808] [<f8bfac50>] (r852_irq+0x0/0x250 [r852])
> [  199.151820] [<c136a080>] (sdhci_irq+0x0/0x570)

Both r852_irq() and sdhci_irq() are on that IRQ so the interrupt could
have been for either one.


> ===
> ...
> ===
> [  208.256014] mmc0: Timeout waiting for hardware interrupt.
> [  208.256019] sdhci: ============== REGISTER DUMP ==============
> [  208.256025] sdhci: Sys addr: 0x00000000 | Version:  0x00000400
> [  208.256030] sdhci: Blk size: 0x00000000 | Blk cnt:  0x00000000
> [  208.256035] sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
> [  208.256040] sdhci: Present:  0x01f70000 | Host ctl: 0x00000001
> [  208.256044] sdhci: Power:    0x0000000f | Blk gap:  0x00000000
> [  208.256049] sdhci: Wake-up:  0x00000000 | Clock:    0x00004007
> [  208.256054] sdhci: Timeout:  0x00000000 | Int stat: 0x00000000
> [  208.256058] sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
> [  208.256063] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000001
> [  208.256068] sdhci: Caps:     0x00c02120 | Max curr: 0x00000040
> [  208.256070] sdhci: ===========================================
> ===

Although it does look more MMCish.  It may be a platform issue too, in
which case we'd need to talk with the ACPI guys.

> lspci, dmesg and config are attached.

Thanks.
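
The remark about r852_irq() and sdhci_irq() sharing one IRQ can be illustrated with a small user-space model. This is only a sketch, not kernel code: the `Device` class and `dispatch` function are hypothetical, and only the `IRQ_NONE`/`IRQ_HANDLED` return convention is taken from the kernel. Each handler on a shared line checks its own device and either claims the interrupt or declines it. When every handler declines, the interrupt is unclaimed, which is the situation that leads to `note_interrupt()` and `__report_bad_irq()` in the trace.

```python
IRQ_NONE, IRQ_HANDLED = 0, 1


class Device:
    """Toy device whose 'pending' flag stands in for an interrupt status register."""

    def __init__(self, name):
        self.name = name
        self.pending = False

    def irq_handler(self):
        # A well-behaved shared handler declines interrupts that are not its own.
        if not self.pending:
            return IRQ_NONE
        self.pending = False  # acknowledge the interrupt
        return IRQ_HANDLED


def dispatch(devices):
    """Call every handler on the shared line; return the names that claimed it."""
    return [dev.name for dev in devices if dev.irq_handler() == IRQ_HANDLED]


if __name__ == "__main__":
    r852, sdhci = Device("r852"), Device("sdhci")
    shared_line = [r852, sdhci]

    sdhci.pending = True
    print(dispatch(shared_line))  # one handler claims the interrupt
    print(dispatch(shared_line))  # nothing pending: the interrupt goes unclaimed
```

In this model an empty result only says that neither device claimed the interrupt. It does not say which device raised the line, which is why the trace alone cannot settle whether r852 or sdhci was responsible.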
Comment 2 Maxim Levitsky 2010-08-20 07:28:09 UTC
This sounds a lot like what I had with CONFIG_MMC_RICOH_MMC enabled.

This doesn't happen very often I guess, right?
Also I guess that you lose SDHCI completely, and if you reload it
(sudo modprobe -r sdhci-pci && sudo modprobe sdhci-pci), it won't load,
telling you in the kernel log that 'Hardware doesn't support any voltages'.

If you can recompile the kernel, try disabling CONFIG_MMC_RICOH_MMC.
(This will make you lose MMC support, but if you get 2.6.36-rc1,
you will get MMC support anyway via the non-standard MMC controller,
which turned out to be an almost standard SDHCI controller.)

Btw, I have written full support for the memstick portion as well.
I have just posted the patches on LKML, and I have a slightly outdated
version on Launchpad.
This makes it the first card reader fully supported by Linux.


Best regards,
Maxim Levitsky
Comment 3 Oleksandr Natalenko 2010-08-20 09:16:57 UTC
> This doesn't happen very often I guess, right?

It happens on every resume.

> If you can recompile kernel, try to disable CONFIG_MMC_RICOH_MMC
> (This will make you lose the mmc support, but if you get 2.6.36-rc1,
> you will get mmc support anyway via the non standard mmc controller,
> which turned out to be almost standard SDHCI controller.)

Yes, disabling CONFIG_MMC_RICOH_MMC resolves the problem. It seems
to be quite buggy.
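
To confirm how an option such as CONFIG_MMC_RICOH_MMC is set in a kernel build, the .config text can be scanned. The following is a minimal sketch: `kconfig_state` is a hypothetical helper, and the `SAMPLE` fragment is illustrative only, not taken from the attached config. It relies on the usual .config convention that enabled options appear as `CONFIG_X=y` or `CONFIG_X=m` and disabled ones as `# CONFIG_X is not set`.

```python
import re

# Illustrative .config fragment (hypothetical, not the reporter's config).
SAMPLE = """\
CONFIG_MMC_SDHCI=y
CONFIG_MMC_SDHCI_PCI=m
# CONFIG_MMC_RICOH_MMC is not set
"""


def kconfig_state(config_text, option):
    """Return the option's value ('y', 'm', ...) or None if it is not set."""
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for opt in ("CONFIG_MMC_SDHCI", "CONFIG_MMC_SDHCI_PCI", "CONFIG_MMC_RICOH_MMC"):
        print(opt, kconfig_state(SAMPLE, opt))
```

Anchoring the pattern on the `=` sign keeps `CONFIG_MMC_SDHCI` from matching the longer `CONFIG_MMC_SDHCI_PCI` line.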
Comment 4 Maxim Levitsky 2010-08-20 09:33:41 UTC
Then just switch to 2.6.36-rc1 or apply this:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ccc92c23240cdf952ef7cc39ba563910dcbc9cbe

I also strongly suggest this one, because otherwise, if you forget an MMC/SD card in the slot, the system will certainly hang on suspend:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4c2ef25fe0b847d2ae818f74758ddb0be1c27d8e


And this minor build fix for the above patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=81ca03a0e2ea0207b2df80e0edcf4c775c07a505
Comment 5 Oleksandr Natalenko 2010-08-20 09:40:38 UTC
> Then just switch to 2.6.36-rc1 or apply this:

Well, OK. Since this problem is fixed in the upcoming 2.6.36, I'll just
wait for the release. Thanks for the help.
Comment 6 Oleksandr Natalenko 2010-08-20 10:38:52 UTC
2.6.36-rc1-git2 with CONFIG_MMC_RICOH_MMC=y seems to work well. So,
should I leave this option enabled?
Comment 7 Maxim Levitsky 2010-08-20 10:47:22 UTC
You can try, but it should give the same problem sooner or later.
Nothing significant changed, so it is probably timing that helps you.

Anyway, can you show me lspci from 2.6.36-rc1-git2 after you did a cold boot and a few suspend/resume cycles (one is enough)?
Comment 8 Oleksandr Natalenko 2010-08-20 12:19:00 UTC
> You can try, but it should give the same problem sooner or later.
> Nothing significant changed, so probably timing helps you.

If I disable CONFIG_MMC_RICOH_MMC in 2.6.36-rc1, will I get the same
functionality from my card reader?

> Anyway, can you show me lspci from 2.6.36-rc1-git2 after you did a cold boot
> and few suspend/resume cycles (one is enough)

Sure, you can grab it from the attachment, but there's no difference between the outputs.
Comment 9 Oleksandr Natalenko 2010-08-20 12:19:20 UTC
The attachment is here.
Comment 10 Oleksandr Natalenko 2010-08-20 12:20:24 UTC
Created attachment 27519
lspci before and after resume
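
A before/after comparison like the one requested here can be automated. The sketch below uses Python's standard `difflib` module. `lspci_changes` is a hypothetical helper, and the sample device lines are placeholders, not the contents of the attachment.

```python
import difflib


def lspci_changes(before, after):
    """Return only the lines that differ between two lspci captures."""
    delta = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff marks removed lines with "- " and added lines with "+ ".
    return [line for line in delta if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Placeholder captures (hypothetical device lines).
    before = "00:00.0 Host bridge: example device (rev 01)\n" \
             "01:00.0 SD Host controller: example reader (rev 22)\n"
    after = before
    changes = lspci_changes(before, after)
    print(changes if changes else "no difference between the outputs")
```

An empty result corresponds to the "no difference" outcome reported in this thread. Capturing `lspci -vv` instead of plain `lspci` would also expose changes in register-level fields, at the cost of noisier output.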
Comment 11 Maxim Levitsky 2010-08-20 12:23:46 UTC
> If I disable CONFIG_MMC_RICOH_MMC in 2.6.36-rc1, will I get the same
> functionality from my card reader?

Yes.
