Bug 7014 - rtc doesn't generate updates on T60p
Summary: rtc doesn't generate updates on T60p
Status: CLOSED CODE_FIX
Alias: None
Product: Timers
Classification: Unclassified
Component: Realtime Clock
Hardware: i386 Linux
Importance: P2 normal
Assignee: timers_realtime-clock
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-08-16 12:14 UTC by Jerry Quinn
Modified: 2008-09-22 16:56 UTC

See Also:
Kernel Version: 2.6.17.8
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
kernel dmesg output (21.68 KB, application/octet-stream)
2007-07-26 14:20 UTC, Jerry Quinn
Kernel config (78.84 KB, application/octet-stream)
2007-11-17 09:43 UTC, Jerry Quinn

Description Jerry Quinn 2006-08-16 12:14:21 UTC
Most recent kernel where this bug did not occur:
Distribution: Debian testing
Hardware Environment: IBM T60p
Software Environment:
Problem Description:

Trying to read from /dev/rtc blocks indefinitely.  If I use the simple test
program in rtc.txt.gz, it hangs on the read and never returns.  The program
appears to turn on update interrupts before reading.

I see the same problem in Debian's latest 2.6.16 and a vanilla 2.6.18-rc4 kernel.

Steps to reproduce:
unsigned long data;
int fd = open("/dev/rtc", O_RDONLY);
ioctl(fd, RTC_UIE_ON, 0);                /* enable update interrupts */
read(fd, &data, sizeof(unsigned long));  /* never returns */
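For anyone retesting, the steps above can be sketched as a helper that reports a timeout instead of hanging.  This is only a sketch, not the test program from rtc.txt.gz: the function names wait_readable and rtc_update_check are made up for illustration, and it assumes a Linux system with <linux/rtc.h> and an RTC driver that supports select() on the device.

```c
#include <fcntl.h>
#include <linux/rtc.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd becomes readable.
 * Returns 1 if readable, 0 if timeout_sec elapsed first, -1 on error. */
int wait_readable(int fd, int timeout_sec)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/* Hypothetical helper: enable update interrupts on the RTC at `path`
 * and wait for one.  Returns 1 if an update arrived (and stores the
 * value read in *data), 0 if none arrived within timeout_sec, and
 * -1 on any error (open, ioctl, select or short read). */
int rtc_update_check(const char *path, int timeout_sec, unsigned long *data)
{
    int fd, rc;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (ioctl(fd, RTC_UIE_ON, 0) < 0) {
        close(fd);
        return -1;
    }

    rc = wait_readable(fd, timeout_sec);
    if (rc == 1 && read(fd, data, sizeof(*data)) != (ssize_t)sizeof(*data))
        rc = -1;

    ioctl(fd, RTC_UIE_OFF, 0);  /* turn update interrupts back off */
    close(fd);
    return rc;
}
```

Calling rtc_update_check("/dev/rtc", 5, &data) should return 1 within about a second on a working system; on a kernel showing this bug it would presumably return 0 once the timeout expires.  The same call with "/dev/rtc0" can be used when testing a different driver.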
Comment 1 Thomas Gleixner 2007-03-26 23:26:07 UTC
Is this still happening with latest mainline ?

    tglx
Comment 2 Jerry Quinn 2007-03-28 07:56:01 UTC
It does still happen with 2.6.18-4 as packaged by debian.  I will have to test
the latest kernel separately.
Comment 3 Natalie Protasevich 2007-07-07 13:24:45 UTC
Jerry, did you have a chance to test?  Is the problem still there?
Thanks.
Comment 4 Jerry Quinn 2007-07-08 11:45:52 UTC
Yes, I still see it in Debian's 2.6.21-1-686
Comment 5 Andrew Morton 2007-07-25 16:36:08 UTC
David wonders "Does it happen with rtc-cmos, or only with the legacy driver?"

and adds

"If the general issue were that RTC irqs aren't arriving, I'd
suspect it's HPET interfering..."
Comment 6 Jerry Quinn 2007-07-26 14:20:33 UTC
Created attachment 12166
kernel dmesg output
Comment 7 Jerry Quinn 2007-07-26 14:20:47 UTC
Does the following answer which rtc driver this is?

naga:/usr/src/madwifi-svn/tools# modinfo rtc
filename:       /lib/modules/2.6.21-2-686/kernel/drivers/char/rtc.ko
alias:          char-major-10-135
license:        GPL
author:         Paul Gortmaker
depends:        
vermagic:       2.6.21-2-686 SMP mod_unload 686 

If not, let me know how to find the info.  I'm also attaching my dmesg output.
Comment 8 Anonymous Emailer 2007-07-26 16:56:40 UTC
Reply-To: david-b@pacbell.net

On Thursday 26 July 2007, bugme-daemon@bugzilla.kernel.org wrote:
> Does the following answer which rtc driver this is?

That was never an issue.  My question was whether this ONLY
showed up with the legacy driver (i.e. the one you were very
clearly using).

To try rtc-cmos, disable the char/rtc.c in Kconfig, and then
go to the RTC framework menu.  Enable all the interfaces
there, and "rtc-cmos".  Now try.
Comment 9 Anonymous Emailer 2007-07-26 17:08:52 UTC
Reply-To: david-b@pacbell.net

> ACPI: HPET id: 0x8086a201 base: 0xfed00000

OK, I was right.  This uses HPET.  Until a bunch of clock patches
merge, HPET prevents the CMOS clock from delivering IRQs.

Your workaround:  don't use HPET.

The short version of the story is that HPET has two modes, which
I tend to call "sane" and "broken".  Unfortunately, "broken" is the
default ... the breakage involves preventing the RTC (and something
else, maybe PIT) from working properly, by taking over their IRQs.

The patches I'm thinking of switch HPET over to use "sane" mode,
with IRQs routed normally (not clobbering other devices).  And
then, conveniently enough, allocate one HPET to each CPU so they
can serve as per-CPU clockevent sources.  I understand that those
patches have been deferred for now, along with all other x86_64
clockevent patches.
Comment 10 Andrew Morton 2007-07-26 17:33:20 UTC
(added Thomas to cc)

Thanks, David.  You know everything.

Jerry, if you're keen you could test 2.6.22-rc6-mm1
which has the patches which David refers to.

Otherwise, please wait until we get x86_64 dynticks/clockevents
back in, which will probably be a few weeks from now.

Thanks.
Comment 11 Anonymous Emailer 2007-07-26 18:44:52 UTC
Reply-To: david-b@pacbell.net

On Thursday 26 July 2007, you wrote:
> Thanks, David. 
Comment 12 Andrew Morton 2007-10-04 13:47:42 UTC
Guys, what's the status of this one?

AFAICT we expect that the x86_64 clockevents patches will fix this
bug, but Jerry has disappeared on us?

David, if Jerry is still working on this, it might be useful
to recap exactly what tests you'd like him to perform - it
isn't terribly clear...


Thanks.
Comment 13 Jerry Quinn 2007-10-04 13:52:11 UTC
bugme-daemon@bugzilla.kernel.org wrote:
> ------- Comment #12 from akpm@osdl.org  2007-10-04 13:47 -------
> Guys, what's the status of this one?
>
> AFAICT we expect that the x86_64 clockevents patches will fix this
> bug, but Jerry has disappeared on us?

Sorry, I'm here.

> David, if Jerry is still working on this, it might be useful
> to recap exactly what tests you'd like him to perform - it
> isn't terribly clear...

That would be useful.  Thanks.
Comment 14 Thomas Gleixner 2007-10-05 05:33:17 UTC
That's strange. HPET should do this ugly emulate RTC thing.

Can you please put your .config into the bugzilla ?

	tglx
Comment 15 David Brownell 2007-10-09 18:52:52 UTC
I thought Comment #8 described it completely.  Kernel config should have ONLY the rtc-cmos driver (not the legacy driver) ... that'll come out as /dev/rtc0 on most systems, which can be symlinked as /dev/rtc if you like.  Verify that the same problem shows up.

The legacy driver "should", as Thomas said, do the ugly thing given HPET in what I called "broken" mode.  The new one does NOT, but I expect it will act fine in "sane" mode when all the x86_64 clockevent and HPET patches merge.  (I don't recall hearing when that merge is planned.  One hopes it's soon...)
Comment 16 Thomas Gleixner 2007-11-14 14:40:55 UTC
Jerry, is this still happening with 2.6.24-rc2 ?

     tglx
Comment 17 Jerry Quinn 2007-11-17 09:42:48 UTC
Yes, I'm still seeing the same trouble.  I'm attaching the config I used for building this kernel.
Comment 18 Jerry Quinn 2007-11-17 09:43:34 UTC
Created attachment 13589
Kernel config
Comment 19 Thomas Gleixner 2007-11-17 09:48:23 UTC
Hmm, it builds both the legacy rtc and the new rtc driver.

David, is there any conflict ?
Comment 20 David Brownell 2007-11-17 10:14:25 UTC
There "should" be no confusion any more, but only one of the two RTC drivers will bind to the hardware.  Both are modular, so there is at least a simple choice of which to use, via "modprobe".  (Though you may want to make sure you've got the latest version of hwclock, from util-linux-ng, since older ones don't understand that they may need to use "/dev/rtc0" not always "/dev/rtc".)

I observe HPET still isn't completely disabled in this config, and since it's still being used in "broken" steal-the-rtc-IRQs mode, I'm thinking that will most likely still be making this trouble.

Until HPET is used only in "sane" mode, there should probably be a way to make sure it's never enabled ... otherwise these "my IRQ's been stolen!" failures will persist.
Comment 21 Ingo Molnar 2007-11-18 07:46:15 UTC
David, what would you suggest we do to get the mainline kernel working fine out of the box?
Comment 22 David Brownell 2007-11-18 15:10:21 UTC
Long term, stop using HPET in "legacy replacement"/broken mode and use it in "standard"/sane mode.  And get rid of that "ugly emulation" thing, and its support in various places.  There's really no point to trying to use that, except possibly as a (nasty) workaround for hardware bugs in "standard" mode.  (Only "standard" mode is guaranteed to exist, too...)  I suspect that fix could not be for 2.6.24 ...

ISTR some other RTC/IRQ bug report that was similar, except that it said x86_32 worked while x86_64 didn't ... there may be some merge issues yet to resolve.  I noticed that x86_64 treats HPET differently.  (Not just in terms of init, but also arch/x86/Kconfig and HPET_TIMER...)

Near term there may be some ways to tweak the RTC code to coexist better with HPET.  "rtc-cmos" doesn't insist on having an IRQ ... so if PNP correctly reported that it doesn't have an IRQ, it should behave.  The legacy RTC driver is (as usual) a mess, but evidently it *used* to work (for some definition of "work") and something broke it.  Someone with HPET skillz should be able to sort this out ... at least the regression aspect of this bug should be fixable, although the botch of not using "sane" HPET mode will still remain.

(Remember that my insights here are restricted to RTC framework and code ... I've touched neither the legacy driver with which this problem is appearing, nor the x86 arch code that seems to be troublesome here.  So I can't help resolve this bug except by noting what must be the root cause: using this "broken" HPET mode setting up a fragile house-of-cards, which this bug reports as starting to fall down.)
Comment 23 Thomas Gleixner 2007-11-18 17:31:47 UTC
> ------- Comment #22 from dbrownell@users.sourceforge.net  2007-11-18 15:10 -------
> Long term, stop using HPET in "legacy replacement"/broken mode and use it in
> "standard"/sane mode.  And get rid of that "ugly emulation" thing, and its
> support in various places.  There's really no point to trying to use that,
> except possibly as a (nasty) workaround for hardware bugs in "standard" mode.
> (Only "standard" mode is guaranteed to exist, too...)  I suspect that fix
> could not be for 2.6.24 ...

We can not use the "non legacy mode" in a sane way as long as BIOSes
are not providing irq routing for the non legacy case. Venki tried to
enforce this, but it is really troublesome.
 
> ISTR some other RTC/IRQ bug report that was similar, except that it said
> x86_32 worked while x86_64 didn't ... there may be some merge issues yet to
> resolve.  I noticed that x86_64 treats HPET differently.  (Not just in terms
> of init, but also arch/x86/Kconfig and HPET_TIMER...)

The 32/64 bit hpet related code is the same now.

> Near term there may be some ways to tweak the RTC code to coexist better with
> HPET.  "rtc-cmos" doesn't insist on having an IRQ ... so if PNP correctly
> reported that it doesn't have an IRQ, it should behave.  The legacy RTC
> driver is (as usual) a mess, but evidently it *used* to work (for some
> definition of "work") and something broke it.  Someone with HPET skillz
> should be able to sort this out ... at least the regression aspect of this
> bug should be fixable, although the botch of not using "sane" HPET mode will
> still remain.
>
> (Remember that my insights here are restricted to RTC framework and code ...
> I've touched neither the legacy driver with which this problem is appearing,
> nor the x86 arch code that seems to be troublesome here.  So I can't help
> resolve this bug except by noting what must be the root cause: using this
> "broken" HPET mode setting up a fragile house-of-cards, which this bug
> reports as starting to fall down.)

Very helpful :)

The only pitfall as far as I can tell is that the old x86_64 code did
not enforce the RTC emulation when HPET was enabled, which is
stupid.  The 32 bit code did.  This is fixed in .24-rc.

    tglx
Comment 24 Anonymous Emailer 2007-11-28 00:22:45 UTC
Reply-To: david-b@pacbell.net


> ------- Comment #23 from tglx@linutronix.de  2007-11-18 17:31 -------
> 
> We can not use the "non legacy mode" in a sane way as long as BIOSes
> are not providing irq routing for the non legacy case. Venki tried to
> enforce this, but it is really troublesome.

Well that's rude.  Gotta love the extent to which BIOS vendors
cripple the hardware.  :(

Next time I get a new PC, I'll hope it has an HPET so I can see
how this affects the new RTC framework (and the rtc-cmos driver).
Comment 25 Natalie Protasevich 2008-03-29 21:08:45 UTC
Jerry, have you tested a current kernel?  Does the rtc work for you now?
Comment 26 Jerry Quinn 2008-03-31 08:49:25 UTC
I still see the problem in Debian kernel 2.6.24-1-686 v 2.6.24-5.  I can give a vanilla kernel a try too.
Comment 27 Anonymous Emailer 2008-04-02 12:39:59 UTC
Reply-To: david-b@pacbell.net

On Monday 31 March 2008, bugme-daemon@bugzilla.kernel.org wrote:
> I still see the problem in Debian kernel 2.6.24-1-686 v 2.6.24-5. 
Comment 28 Jerry Quinn 2008-05-19 16:24:21 UTC
The simple test program works on Debian's 2.6.25
Comment 29 Anonymous Emailer 2008-05-19 18:02:09 UTC
Reply-To: david-b@pacbell.net

On Monday 19 May 2008, you wrote:
> ------- Comment #28 from jlquinn@optonline.net 
Comment 30 Jerry Quinn 2008-05-19 19:35:09 UTC
(In reply to comment #29)
> Reply-To: david-b@pacbell.net
> 
> On Monday 19 May 2008, you wrote:
> > ------- Comment #28 from jlquinn@optonline.net  2008-05-19 16:24 -------
> > The simple test program works on Debian's 2.6.25
> 
> So ... the bug is fixed in 2.6.25?
> Or is this a Debian-specific patch/config issue?

I see nothing in Debian's kernel package changelog to indicate they apply a patch for the problem, so I'd conclude it is fixed in 2.6.25.

Thanks
