Bug 11051

Summary: rtc_cmos: Periodic timer doesn't work after using /dev/rtc0 once and then doing s2ram
Product: Drivers Reporter: Tomas Janousek (tomi)
Component: Other Assignee: David Brownell (dbrownell)
Status: RESOLVED CODE_FIX    
Severity: low CC: bernhard.walle
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.26, 2.6.26 Subsystem:
Regression: --- Bisected commit-id:
Attachments: dmesg
nonworking .config
working .config
update rtc-cmos to use HPET glue more consistently
updated hpet-glue patch

Description Tomas Janousek 2008-07-07 05:51:20 UTC
Distribution: Debian
Hardware Environment: HP Compaq nx7300
Problem Description:
As the summary suggests, after using the RTC as a periodic timer once and then suspending to RAM, the periodic timer stops working: mplayer just hangs waiting for data on the rtc fd. aireplay-ng shows the same behaviour.

Steps to reproduce:
1. mplayer -rtc -rtc-device /dev/rtc0 file.avi
2. s2ram
3. mplayer -rtc -rtc-device /dev/rtc0 file.avi
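
For anyone who wants to check this without mplayer, here is a minimal sketch of the same access pattern (set a periodic rate, enable periodic interrupts, wait for data on the fd), in the spirit of rtctest. It is not what mplayer actually runs. The function name count_rtc_ticks and its parameters are made up for illustration; the ioctls (RTC_IRQP_SET, RTC_PIE_ON, RTC_PIE_OFF) are the standard ones from <linux/rtc.h>. It uses select() with a timeout so that a dead timer shows up as a count of 0 instead of a hang.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/rtc.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper: count wakeups from the RTC periodic interrupt.
 * Returns the number of successful reads (0 means the timer stayed
 * silent, which is the symptom in this report), or -1 if the device
 * could not be opened or configured.
 */
int count_rtc_ticks(const char *dev, unsigned long hz, int want, int timeout_sec)
{
    assert(dev != NULL);

    int fd = open(dev, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Set the periodic rate, then enable periodic interrupts. */
    if (ioctl(fd, RTC_IRQP_SET, hz) < 0 || ioctl(fd, RTC_PIE_ON, 0) < 0) {
        close(fd);
        return -1;
    }

    int seen = 0;
    while (seen < want) {
        fd_set rfds;
        struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        /* A plain read() would block forever on a dead timer. */
        if (select(fd + 1, &rfds, NULL, NULL, &tv) <= 0)
            break;

        unsigned long data;
        if (read(fd, &data, sizeof(data)) != (ssize_t)sizeof(data))
            break;
        seen++;
    }

    /* Turn the periodic interrupt off again before closing. */
    ioctl(fd, RTC_PIE_OFF, 0);
    close(fd);
    return seen;
}
```

Calling count_rtc_ticks("/dev/rtc0", 64, 16, 2) once before s2ram and once after should, if it triggers the same problem, return 16 the first time and 0 the second. 64 Hz is used because higher rates normally need root.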

/proc/driver/rtc looks like this regardless of whether mplayer is running or
not:
rtc_time        : 13:34:36
rtc_date        : 2008-07-07
alrm_time       : 15:02:56
alrm_date       : ****-**-**
alarm_IRQ       : no
alrm_pending    : no
24hr            : yes
periodic_IRQ    : no
update_IRQ      : no
HPET_emulated   : yes
DST_enable      : no
periodic_freq   : 1024
batt_status     : okay
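
If you want to check these fields from a test program rather than by eye, a small parser for the "name : yes/no" lines is enough. This is only a sketch: rtc_proc_flag is a made-up name, and it assumes the file keeps the layout shown above.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: look up a yes/no field in /proc/driver/rtc text.
 * Returns 1 for "yes", 0 for "no", -1 if the key is missing or its
 * value is something else (a time, a number, ...).
 */
int rtc_proc_flag(const char *text, const char *key)
{
    assert(text != NULL && key != NULL);

    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);
        const char *end = line + len;

        if (len > klen && strncmp(line, key, klen) == 0) {
            const char *p = line + klen;

            /* The key must be followed by padding and a colon. */
            while (p < end && (*p == ' ' || *p == '\t'))
                p++;
            if (p < end && *p == ':') {
                p++;
                while (p < end && (*p == ' ' || *p == '\t'))
                    p++;

                /* Trim trailing whitespace from the value. */
                size_t vlen = (size_t)(end - p);
                while (vlen > 0 && isspace((unsigned char)p[vlen - 1]))
                    vlen--;

                if (vlen == 3 && strncmp(p, "yes", 3) == 0)
                    return 1;
                if (vlen == 2 && strncmp(p, "no", 2) == 0)
                    return 0;
                return -1;
            }
        }
        line = eol ? eol + 1 : NULL;
    }
    return -1;
}
```

With the output above, looking up "periodic_IRQ" gives 0 and "HPET_emulated" gives 1.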

Having it as a module and doing rmmod/modprobe makes it work again, so that's
a workaround.

I'm attaching a dmesg with a few suspends and rmmod/modprobes, but there are
no messages about the breakage -- just for the details of my machine and
configuration.

This did work with 2.6.25 and the old RTC code which is now removed. For
details, see the attached .configs.
Comment 1 Tomas Janousek 2008-07-07 05:52:23 UTC
Created attachment 16761 [details]
dmesg
Comment 2 Tomas Janousek 2008-07-07 05:52:47 UTC
Created attachment 16762 [details]
nonworking .config
Comment 3 Tomas Janousek 2008-07-07 05:53:01 UTC
Created attachment 16763 [details]
working .config
Comment 4 David Brownell 2008-07-12 15:31:24 UTC
Hmm, using "rtctest" (from Documentation/rtc.txt) on a machine which also has the HPET emulation going on, I was unable to reproduce this.  Could you try reproducing it using that test program?

Looking at the dmesg outputs, nothing looked particularly odd except the "rtc: lost 1 interrupts" message.  That came from arch/x86/kernel/hpet.c (hence the text is misleading:  it should maybe say "hpet: lost 1 rtc interrupt") where hpet_rtc_timer_reinit() got called ... it looks like the problem is coupled to the HPET emulation having goofed up the state of the IRQ flags.

Someone who knows the HPET stuff should look at this.  I observe that neither the cmos_irq_set_state() nor cmos_suspend()/cmos_resume() methods were ever taught anything about HPET; maybe that's the root cause here.

The fact that cmos_procfs() wasn't taught about it explains why the /proc/driver/rtc file never reports the right value for the periodic_IRQ or update_IRQ fields; easily observed while running "rtctest".  The HPET glue seems to be missing a way to query the emulated IRQ flags.

Another HPET-unaware routine is cmos_do_shutdown().
Comment 5 David Brownell 2008-07-12 21:56:59 UTC
Created attachment 16800 [details]
update rtc-cmos to use HPET glue more consistently

I'll be trying this patch in a while; as its description says, it basically tightens up the handling of the emulated RTC stuff.  There were several small holes in that handling, some of which might have caused problems like this.
Comment 6 Tomas Janousek 2008-07-13 04:56:46 UTC
Well, this patch fixed the problem for me, thanks.

Btw, as it corrected the output of /proc/driver/rtc, I noticed that after running mplayer once, the periodic_IRQ stays up forever (even after resume). Maybe that's the reason for those lost interrupts?
Comment 7 David Brownell 2008-07-15 16:07:55 UTC
Created attachment 16829 [details]
updated hpet-glue patch

this version passed sanity testing on one machine with hpet
Comment 8 Tomas Janousek 2008-07-15 18:05:10 UTC
Yeah, this new patch works as well.

I dug into the problem I mentioned in comment #6 a bit and found that the old rtc driver did turn off all irqs in the rtc_release function, but neither rtc-cmos nor rtc-dev does so. The UIE emulation in rtc-dev does turn off UIE in its release function, and other drivers like rtc-sh or rtc-bfin also turn off their interrupts on release.

So, I guess we should add a release op to rtc-cmos and clear the interrupts there. Now, if the application does not explicitly set PIE off, the interrupts keep firing (verified with powertop) forever, even after resume.
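
Until something like that exists, an application can avoid leaving interrupts running by turning off whatever it enabled before it closes the fd. A rough userspace sketch, not the driver-side fix: rtc_quiet_close and the RTCQ_* flags are made-up names, while the RTC_*_OFF ioctls are the standard ones from <linux/rtc.h>.

```c
#include <assert.h>
#include <linux/rtc.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical flags naming the interrupts this application enabled. */
enum {
    RTCQ_PIE = 1 << 0,  /* periodic interrupt */
    RTCQ_UIE = 1 << 1,  /* update interrupt */
    RTCQ_AIE = 1 << 2   /* alarm interrupt */
};

/*
 * Hypothetical helper: disable the interrupts named in 'enabled', then
 * close the fd. Only touches what the caller says it turned on, so an
 * alarm set by someone else is left alone. Returns the result of close().
 */
int rtc_quiet_close(int fd, unsigned int enabled)
{
    assert((enabled & ~(unsigned int)(RTCQ_PIE | RTCQ_UIE | RTCQ_AIE)) == 0);

    /* Best effort: a driver may reject an ioctl it does not support. */
    if (enabled & RTCQ_PIE)
        (void)ioctl(fd, RTC_PIE_OFF, 0);
    if (enabled & RTCQ_UIE)
        (void)ioctl(fd, RTC_UIE_OFF, 0);
    if (enabled & RTCQ_AIE)
        (void)ioctl(fd, RTC_AIE_OFF, 0);

    return close(fd);
}
```

An application that only used the periodic timer would call rtc_quiet_close(fd, RTCQ_PIE) instead of close(fd).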
Comment 9 David Brownell 2008-07-16 10:39:42 UTC
Thanks for the testing, I'll probably submit this today.

The "leave IRQs enabled" bug is a separate issue, which I think should be addressed within the RTC framework ... drivers shouldn't need to work around that stuff.
Comment 10 Tomas Janousek 2008-07-16 11:49:10 UTC
Ok, fine. I'll open a bug for that and possibly try to code some patches.
Comment 11 David Brownell 2008-07-17 16:24:17 UTC
patch sent to lkml and merged into MM for 2.6.27 ... it's not quite either version attached to this bugreport.  Candidate for stable series.