Subject : BUG: rtc_cmos (2.6.34-rc7)
Submitter : Randy Dunlap <randy.dunlap@oracle.com>
Date : 2010-05-10 23:09
Message-ID : 4BE89243.8090809@oracle.com
References : http://marc.info/?l=linux-kernel&m=127353313728385&w=2

This entry is being used for tracking a regression from 2.6.33. Please don't close it until the problem is fixed in the mainline.
I have now hit this BUG about 6 times. It does not always happen, but I have been pushing it a lot. Somehow the struct cmos_rtc *cmos in cmos_update_irq_enable() is NULL (returned from dev_get_drvdata()). I have the most success in causing the BUG to happen after I load and unload several PCMCIA drivers, then load/unload rtc-cmos module multiple times.
Could hwclock be doing something odd? In all of my kernel logs with this BUG, I see this: Pid: 3787, comm: hwclock Not tainted 2.6.34-rc7 #2 0HH807/OptiPlex GX620
Created attachment 26376: move setting driver data before rtc_device_register()

Hi Randy, I think this patch will take care of the problem. Can you give it a whirl?
Hi Dan, Thanks for the patch... but it consistently bugs during initialization:

[ 18.063430] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 18.064183] IP: [<ffffffffa0243c4d>] cmos_do_probe+0x1e6/0x638 [rtc_cmos]
[ 18.064183] PGD 7c6e7067 PUD 7c6cb067 PMD 0
[ 18.064183] Oops: 0000 [#1] SMP
[ 18.064183] last sysfs file: /sys/class/scsi_generic/sg0/dev
[ 18.064183] CPU 1
[ 18.064183] Modules linked in: rtc_cmos(+) psmouse serio_raw pcspkr sg rtc_core i2c_i801 rng_core rtc_lib parport button intel_agp thermal processor thermal_sys hwmon sr_mod cdrom ata_generic pata_acpi ata_piix libata ide_pci_generic ide_core sd_mod scsi_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ssb mmc_core pcmcia pcmcia_core ehci_hcd usbcore nls_base [last unloaded: scsi_wait_scan]
[ 18.064183]
[ 18.064183] Pid: 2061, comm: modprobe Not tainted 2.6.34-rc7 #3 0HH807/OptiPlex GX620
[ 18.064183] RIP: 0010:[<ffffffffa0243c4d>] [<ffffffffa0243c4d>] cmos_do_probe+0x1e6/0x638 [rtc_cmos]
[ 18.064183] RSP: 0018:ffff880079c35d48 EFLAGS: 00010202
[ 18.064183] RAX: 0000000000000000 RBX: ffff88007e845b18 RCX: ffffffffa0245a70
[ 18.064183] RDX: ffffffffa0245490 RSI: ffff88007e845b18 RDI: ffffffffa0245480
[ 18.064183] RBP: ffff880079c35d78 R08: ffffffff81830d38 R09: ffffffff810573ae
[ 18.064183] R10: ffffffffa0245480 R11: ffffffff81830ca0 R12: ffffffffa0245ce0
[ 18.064183] R13: ffff88007e383440 R14: 0000000000000008 R15: 0000000000000100
[ 18.064183] FS: 00007f94d15036f0(0000) GS:ffff880005400000(0000) knlGS:0000000000000000
[ 18.064183] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 18.064183] CR2: 0000000000000010 CR3: 0000000079c09000 CR4: 00000000000006e0
[ 18.064183] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 18.064183] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 18.064183] Process modprobe (pid: 2061, threadinfo ffff880079c34000, task ffff88007c720000)
[ 18.064183] Stack:
[ 18.064183] ffff880079c35d78 ffff88007e845b18 0000000000000008 ffffffffa0245780
[ 18.064183] <0> ffffffffa0245740 0000000000000000 ffff880079c35d98 ffffffffa02450aa
[ 18.064183] <0> ffff88007e845b18 ffffffffa0245500 ffff880079c35dd8 ffffffff8126d670
[ 18.064183] Call Trace:
[ 18.064183] [<ffffffffa02450aa>] cmos_pnp_probe+0x108/0x114 [rtc_cmos]
[ 18.064183] [<ffffffff8126d670>] pnp_device_probe+0x115/0x159
[ 18.064183] [<ffffffff812ccd73>] ? driver_sysfs_add+0x5c/0x96
[ 18.064183] [<ffffffff812cd077>] driver_probe_device+0x1b7/0x334
[ 18.064183] [<ffffffff812cd27c>] __driver_attach+0x88/0xc0
[ 18.064183] [<ffffffff812cd1f4>] ? __driver_attach+0x0/0xc0
[ 18.064183] [<ffffffff812cbed2>] bus_for_each_dev+0x7e/0xd6
[ 18.064183] [<ffffffff812ccc8e>] driver_attach+0x20/0x29
[ 18.064183] [<ffffffff812cc690>] bus_add_driver+0x147/0x361
[ 18.064183] [<ffffffff812cd6e3>] driver_register+0xf3/0x1ad
[ 18.064183] [<ffffffffa01c7000>] ? cmos_init+0x0/0xad [rtc_cmos]
[ 18.064183] [<ffffffff8126d2db>] pnp_register_driver+0x23/0x2c
[ 18.064183] [<ffffffffa01c701b>] cmos_init+0x1b/0xad [rtc_cmos]
[ 18.064183] [<ffffffff8100023a>] do_one_initcall+0x75/0x1cb
[ 18.064183] [<ffffffff810920bb>] sys_init_module+0x134/0x326
[ 18.064183] [<ffffffff810034eb>] system_call_fastpath+0x16/0x1b
[ 18.064183] Code: e8 c9 8c 08 e1 48 8b 05 72 20 00 00 48 ff 05 53 2b 00 00 48 c7 c1 70 5a 24 a0 48 c7 c2 90 54 24 a0 48 89 de 48 c7 c7 80 54 24 a0 <48> 8b 40 10 49 89 45 10 e8 bc e4 f1 ff 48 3d 00 f0 ff ff 48 89
[ 18.064183] RIP [<ffffffffa0243c4d>] cmos_do_probe+0x1e6/0x638 [rtc_cmos]
[ 18.064183] RSP <ffff880079c35d48>
[ 18.064183] CR2: 0000000000000010
[ 18.395492] ---[ end trace c91b161e66911807 ]---
Created attachment 26390: move setting driver data before rtc_device_register() - version 2.

Aish... Randy, I'm embarrassed. That was an avoidable bug. It won't happen again. Sorry for that. I can't reproduce your original bug on my system with an unmodified kernel. I put the system in a loop where it did an rmmod and a modprobe 10000 times, but I didn't see the original bug. But I still think setting the driver data earlier _should_ fix it. Could you test version 2?
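[Editor's illustration] The ordering issue discussed above can be sketched in userspace C. This is only a model, not the contents of attachment 26390 or of the final commit. It assumes that registering the device makes it reachable by callbacks before the probe function has stored its driver data (for example, hwclock opening the device as soon as it appears). The types, fake_register(), probe_racy() and probe_fixed() are hypothetical stand-ins, the dev_get_drvdata()/dev_set_drvdata() helpers here are simplified re-implementations rather than the kernel's, and the IRQ value used in the usage note is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption, not real layouts). */
struct device { void *drvdata; };
struct cmos_rtc { int irq; };

static void *dev_get_drvdata(const struct device *dev) { return dev->drvdata; }
static void dev_set_drvdata(struct device *dev, void *data) { dev->drvdata = data; }

/*
 * Models a callback such as cmos_update_irq_enable(). Where the real
 * driver would dereference a NULL pointer and oops, this returns -1.
 */
static int update_irq_enable(struct device *dev)
{
    struct cmos_rtc *cmos = dev_get_drvdata(dev);

    if (cmos == NULL)
        return -1;
    return cmos->irq;
}

/*
 * Hypothetical model of registration: the device becomes visible and a
 * callback fires immediately, before the caller regains control.
 */
static int fake_register(struct device *dev)
{
    return update_irq_enable(dev);
}

/* Racy ordering: register first, set driver data afterwards. */
int probe_racy(struct device *dev, struct cmos_rtc *cmos)
{
    int seen;

    assert(dev != NULL && cmos != NULL);
    seen = fake_register(dev);      /* callback sees NULL drvdata */
    dev_set_drvdata(dev, cmos);
    return seen;
}

/* Ordering proposed in the patch title: set driver data before registering. */
int probe_fixed(struct device *dev, struct cmos_rtc *cmos)
{
    assert(dev != NULL && cmos != NULL);
    dev_set_drvdata(dev, cmos);
    return fake_register(dev);      /* callback sees valid drvdata */
}
```

In this model, calling probe_racy() on a fresh device returns -1 (the callback ran with NULL driver data), while probe_fixed() returns the irq field of the cmos_rtc that was passed in. In the real driver the window is narrow and timing-dependent, which may be why the original BUG was hard to reproduce.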
Handled-By : Dan Carpenter <error27@gmail.com>
Patch : https://bugzilla.kernel.org/attachment.cgi?id=26390
Hi Dan, Patch #2 seems to have fixed the bug that I was seeing. Thanks. Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Fixed by commit 6ba8bcd457d9fc793ac9435aa2e4138f571d4ec5.