Bug 40672

Summary: screen flickers, touchpad unreliable, kernel crashes; unless acpi=off - Lenovo Ideapad U455, AMD Athlon(tm) Neo Processor MV-40
Product: ACPI Reporter: Igor Murzov (e-mail)
Component: Power-Video Assignee: acpi_power-battery
Status: CLOSED CODE_FIX    
Severity: normal CC: acpi-bugzilla, florian, lenb, rui.zhang
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.4.0-rc5 Subsystem:
Regression: No Bisected commit-id:
Attachments: acpidump output
lshw output
/var/log/dmesg
/var/log/syslog
kernel config
output of Linux-ready Firmware Developer Kit
Screenshot of the kernel panic
Screenshot of the kernel panic 2
dmesg output when touchpad was not working
dmesg output when touchpad worked fine
Screenshot of the kernel panic #3
full dmesg output with warning in it
Disassembled DSDT
dmesg output when display flickering was observed

Description Igor Murzov 2011-08-07 15:56:46 UTC
Created attachment 67872 [details]
acpidump output

Without any custom options my system boots, but it is unstable: the screen sometimes flickers, garbage appears in the framebuffer, the touchpad does not always work, and sometimes the kernel even crashes at boot time. With acpi=off everything boots and works stable without any flickering and without garbage in framebuffer. I've tried to boot with `acpi=ht`, but issues were still here.
Comment 1 Igor Murzov 2011-08-07 15:57:34 UTC
Created attachment 67882 [details]
lshw output
Comment 2 Igor Murzov 2011-08-07 15:58:57 UTC
Created attachment 67892 [details]
/var/log/dmesg

Nothing interesting here, I suppose.
Comment 3 Igor Murzov 2011-08-07 16:00:33 UTC
Created attachment 67902 [details]
/var/log/syslog

This one contains interesting call traces.
Comment 4 Igor Murzov 2011-08-07 20:13:52 UTC
Created attachment 67932 [details]
kernel config
Comment 5 Len Brown 2011-08-08 16:14:20 UTC
> With acpi=off everything boots
> and works stable without any flickering and without garbage in framebuffer.

> I've tried to boot with `acpi=ht`, but issues were still here.

"acpi=ht" is not a valid boot parameter in 3.0.1.

Please confirm that you still see the problems when
you boot with "maxcpus=1"

In a configuration that shows the issues, please show the output
from:

cat /proc/interrupts; sleep 10; cat /proc/interrupts

re: the backtraces

 WARNING: at drivers/base/driver.c:262 driver_unregister+0x8a/0xa0()
 Hardware name: 20046                           
 Unexpected driver unregister!
 Modules linked in: battery(-) snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ipv6 ext2 mbcache ppdev lp parport_pc parport fuse btusb bluetooth uvcvideo bcma videodev ohci_hcd v4l2_compat_ioctl32 radeon sp5100_tco ssb joydev ttm brcmsmac(C) drm_kms_helper brcmutil(C) drm snd_hda_codec_hdmi mmc_core snd_hda_codec_conexant agpgart snd_hda_intel snd_hda_codec ehci_hcd evdev i2c_piix4 rtc_cmos atl1c mac80211 psmouse cfg80211 serio_raw rfkill snd_hwdep snd_pcm snd_timer snd i2c_algo_bit i2c_core pcmcia pcmcia_core k8temp crc_ccitt soundcore hwmon sg shpchp snd_page_alloc loop btrfs [last unloaded: pcmcia_rsrc]
 Pid: 3607, comm: rmmod Tainted: G         C  3.0.1 #19
 Call Trace:
  [<ffffffff8105628f>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff81056386>] warn_slowpath_fmt+0x46/0x50
  [<ffffffff8131029a>] driver_unregister+0x8a/0xa0
  [<ffffffff812979f8>] acpi_bus_unregister_driver+0x15/0x17
  [<ffffffffa009c37c>] acpi_battery_exit+0x10/0x1e [battery]
  [<ffffffff81092242>] sys_delete_module+0x192/0x290
  [<ffffffff814f1342>] system_call_fastpath+0x16/0x1b
 ---[ end trace 4a476ded370745e3 ]---

This one looks like driver_unregister() was called by
the battery driver with an illegal driver pointer.

There was a race condition recently fixed upstream
in the battery driver, it would be interesting if this
problem goes away when you run the upstream kernel.

But for now, in the interest of isolating the issue,
if you can blacklist the battery driver and still
reproduce the instability issues (or not),
that would be helpful.
eg. edit /etc/modprobe.d/blacklist.conf
on most distros

The other stack trace is in strlen, possibly
where something is touching procfs.

What happens if you follow the advice of dmesg
and build the kernel with CONFIG_ACPI_PROCFS_POWER=n?
Comment 6 Igor Murzov 2011-08-08 18:04:40 UTC
> Please confirm that you still see the problems when
> you boot with "maxcpus=1"

Confirmed. Issues are still here with "maxcpus=1".

> In a configuration that shows the issues, 

The following output was obtained with
Command line: BOOT_IMAGE=/vmlinuz root=/dev/sda2 maxcpus=1 ro

> please show the output from:
> cat /proc/interrupts; sleep 10; cat /proc/interrupts

            CPU0       
   0:      37013   IO-APIC-edge      timer
   1:        637   IO-APIC-edge      i8042
   7:          1   IO-APIC-edge    
   8:        114   IO-APIC-edge      rtc0
   9:       9878   IO-APIC-fasteoi   acpi
  12:      30754   IO-APIC-edge      i8042
  14:          0   IO-APIC-edge      pata_atiixp
  15:          0   IO-APIC-edge      pata_atiixp
  16:       1304   IO-APIC-fasteoi   hda_intel, ohci_hcd:usb3, ohci_hcd:usb4
  17:        324   IO-APIC-fasteoi   ehci_hcd:usb1, brcmsmac
  18:        412   IO-APIC-fasteoi   ohci_hcd:usb5, radeon
  19:          3   IO-APIC-fasteoi   ehci_hcd:usb2
  44:      14108   PCI-MSI-edge      ahci
  45:         29   PCI-MSI-edge      hda_intel
  46:          3   PCI-MSI-edge      radeon
  47:        212   PCI-MSI-edge      eth0
 NMI:          0   Non-maskable interrupts
 LOC:      28940   Local timer interrupts
 SPU:          0   Spurious interrupts
 PMI:          0   Performance monitoring interrupts
 IWI:          0   IRQ work interrupts
 RES:          0   Rescheduling interrupts
 CAL:          0   Function call interrupts
 TLB:          0   TLB shootdowns
 TRM:          0   Thermal event interrupts
 THR:          0   Threshold APIC interrupts
 MCE:          0   Machine check exceptions
 MCP:          2   Machine check polls
 ERR:          1
 MIS:          0
            CPU0       
   0:      38007   IO-APIC-edge      timer
   1:        637   IO-APIC-edge      i8042
   7:          1   IO-APIC-edge    
   8:        114   IO-APIC-edge      rtc0
   9:       9900   IO-APIC-fasteoi   acpi
  12:      30754   IO-APIC-edge      i8042
  14:          0   IO-APIC-edge      pata_atiixp
  15:          0   IO-APIC-edge      pata_atiixp
  16:       1736   IO-APIC-fasteoi   hda_intel, ohci_hcd:usb3, ohci_hcd:usb4
  17:        335   IO-APIC-fasteoi   ehci_hcd:usb1, brcmsmac
  18:        412   IO-APIC-fasteoi   ohci_hcd:usb5, radeon
  19:          3   IO-APIC-fasteoi   ehci_hcd:usb2
  44:      14113   PCI-MSI-edge      ahci
  45:         29   PCI-MSI-edge      hda_intel
  46:          3   PCI-MSI-edge      radeon
  47:        230   PCI-MSI-edge      eth0
 NMI:          0   Non-maskable interrupts
 LOC:      29494   Local timer interrupts
 SPU:          0   Spurious interrupts
 PMI:          0   Performance monitoring interrupts
 IWI:          0   IRQ work interrupts
 RES:          0   Rescheduling interrupts
 CAL:          0   Function call interrupts
 TLB:          0   TLB shootdowns
 TRM:          0   Thermal event interrupts
 THR:          0   Threshold APIC interrupts
 MCE:          0   Machine check exceptions
 MCP:          2   Machine check polls
 ERR:          1
 MIS:          0

> There was a race condition recently fixed upstream
> in the battery driver, it would be interesting if this
> problem goes away when you run the upstream kernel.

> build the kernel with CONFIG_ACPI_PROCFS_POWER=n

Will alter config and build kernel 3.1rc1 today.
Comment 7 Igor Murzov 2011-08-08 18:49:01 UTC
Created attachment 68112 [details]
output of Linux-ready Firmware Developer Kit

I have also tested the laptop with this tool: http://linuxfirmwarekit.org/download/firmwarekit-r3.iso
Screenshot: http://i.imgur.com/FSTCs.jpg
Comment 8 Len Brown 2011-08-08 19:02:08 UTC
Well, there are plenty of AMD quirks and workarounds evident in dmesg.
Unfortunately, I have no expertise with any of them.

/proc/interrupts doesn't show anything *thrashing* that would explain
your touchpad issues.  It does show that you get lots of clock ticks,
and I wonder if there is something odd in your .config, such as
missing CONFIG_NO_HZ=y. powertop would probably complain about such things.
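
As a rough aid for reading the two snapshots from comment 6, the per-IRQ growth between them can be computed with a short script. This is only a minimal sketch and was not used in this report: the names `parse_interrupts` and `interrupt_deltas` are made up for illustration, and the sample text is an abbreviated copy of the single-CPU output posted above.

```python
def parse_interrupts(text):
    """Map each /proc/interrupts row label to its count summed over CPUs."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row lists CPU0, CPU1, ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()[:ncpus]  # keep only the per-CPU count columns
        counts[label.strip()] = sum(int(f) for f in fields if f.isdigit())
    return counts


def interrupt_deltas(before, after):
    """Return how much each counter grew between two snapshots."""
    return {k: after[k] - before[k] for k in after if k in before}


# Abbreviated copies of the two snapshots posted in comment 6.
BEFORE = """\
            CPU0
   0:      37013   IO-APIC-edge      timer
   9:       9878   IO-APIC-fasteoi   acpi
  12:      30754   IO-APIC-edge      i8042
  16:       1304   IO-APIC-fasteoi   hda_intel, ohci_hcd:usb3, ohci_hcd:usb4
 LOC:      28940   Local timer interrupts
"""
AFTER = """\
            CPU0
   0:      38007   IO-APIC-edge      timer
   9:       9900   IO-APIC-fasteoi   acpi
  12:      30754   IO-APIC-edge      i8042
  16:       1736   IO-APIC-fasteoi   hda_intel, ohci_hcd:usb3, ohci_hcd:usb4
 LOC:      29494   Local timer interrupts
"""

if __name__ == "__main__":
    deltas = interrupt_deltas(parse_interrupts(BEFORE), parse_interrupts(AFTER))
    # Print the fastest-growing counters first.
    for label, delta in sorted(deltas.items(), key=lambda kv: -kv[1]):
        print(f"{label:>4}: +{delta}")
```

On these samples IRQ 0 (timer) grows by 994 over the sleep interval while IRQ 12 (the i8042 AUX line) does not change at all.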

The firmware-test-kit complains about the hpet.
You might consider disabling it to see if that helps.
hpet=disable
Comment 9 Igor Murzov 2011-08-08 20:49:08 UTC
Len, it seems I confused you about the touchpad issue, so let me clarify. The touchpad actually works fine, but not after every system boot: it either works fine or doesn't work at all. Sometimes the system boots without any visible problems, but not every time. The most common issues are screen flickering and garbage in the framebuffer.
Comment 10 Igor Murzov 2011-08-09 15:59:48 UTC
Created attachment 68232 [details]
Screenshot of the kernel panic

I've tried booting with 'hpet=disable' on the command line. Out of six boots, four times the system worked fine without visible issues. Once the kernel crashed at boot time (photo is attached). And once I saw screen flickering and junk symbols in the framebuffer. So this option didn't help :(
Comment 11 Igor Murzov 2011-08-09 16:02:46 UTC
Forgot to mention that my previous comment refers to the same 3.0.1 kernel, not to 3.1rc1.
Comment 12 Igor Murzov 2011-08-14 15:20:14 UTC
The same crash as in comment #10 occurs on kernel 3.1rc1+ too.
Comment 13 Igor Murzov 2011-08-29 00:44:22 UTC
Created attachment 70752 [details]
Screenshot of the kernel panic 2

This panic happened when I turned the screen on with Fn + F2. Before this panic I had locked the KDE session, turned the screen off, and left the laptop in this state for about 4 hours.
Comment 14 Zhang Rui 2012-01-18 05:31:28 UTC
It's great that the kernel bugzilla is back.

Can you please verify if the problem still exists in the latest upstream
kernel?
Comment 15 Igor Murzov 2012-01-22 16:35:39 UTC
I haven't seen screen flicker since upgrading to kernel 3.3rc1, but the touchpad still does not work on every system boot.
Comment 16 Igor Murzov 2012-01-28 17:11:53 UTC
Here is the interesting part of the diff between the dmesg output when the touchpad was not working and the dmesg output when the touchpad worked fine:
-------------------------------------------------
@@ -598,6 +598,7 @@
  ata3: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x9008 irq 15
  i8042: PNP: PS/2 Controller [PNP0303:KBC0,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
  serio: i8042 KBD port at 0x60,0x64 irq 1
+ serio: i8042 AUX port at 0x60,0x64 irq 12
  mousedev: PS/2 mouse device common for all mice
  md: linear personality registered for level -1
  md: raid0 personality registered for level 0
@@ -615,6 +616,7 @@
  Initializing XFRM netlink socket
  NET: Registered protocol family 17
  Registering the dns_resolver key type
+ input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
  ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
  ata1.00: ATA-8: HITACHI HTS545025B9A300, PB2ZC61H, max UDMA/100
  ata1.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
-------------------------------------------------
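
A comparison like the one above can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure actually used for this comment: it assumes the two logs carry dmesg timestamps that should be ignored, the helper names `normalize` and `added_lines` are hypothetical, and the sample logs (including their timestamps) are shortened, made-up stand-ins for attachments 72215 and 72216.

```python
import difflib
import re

# Matches a leading dmesg timestamp such as "[    1.220201] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def normalize(log_text):
    """Drop timestamps so that only the message content is compared."""
    return [TIMESTAMP.sub("", line) for line in log_text.splitlines()]


def added_lines(bad_log, good_log):
    """Return messages that appear only in the good boot."""
    diff = difflib.unified_diff(
        normalize(bad_log), normalize(good_log), lineterm="", n=0
    )
    # Keep "+" lines but skip the "+++" file header emitted by unified_diff.
    return [d[1:] for d in diff if d.startswith("+") and not d.startswith("+++")]


# Shortened stand-ins for the two logs; the timestamps are invented.
BAD_BOOT = """\
[    1.100000] serio: i8042 KBD port at 0x60,0x64 irq 1
[    1.100500] mousedev: PS/2 mouse device common for all mice
"""
GOOD_BOOT = """\
[    1.120000] serio: i8042 KBD port at 0x60,0x64 irq 1
[    1.120300] serio: i8042 AUX port at 0x60,0x64 irq 12
[    1.120900] mousedev: PS/2 mouse device common for all mice
"""

if __name__ == "__main__":
    for line in added_lines(BAD_BOOT, GOOD_BOOT):
        print("+", line)
```

Stripping the timestamps first keeps the diff limited to messages that are really missing, such as the AUX port registration.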
Comment 17 Igor Murzov 2012-01-28 17:13:31 UTC
Created attachment 72215 [details]
dmesg output when touchpad was not working
Comment 18 Igor Murzov 2012-01-28 17:14:32 UTC
Created attachment 72216 [details]
dmesg output when touchpad worked fine
Comment 19 Florian Mickler 2012-02-11 15:25:02 UTC
A patch referencing this bug report has been merged in Linux v3.3-rc3:

commit 82b982c9a697e7be0745523a53334fe38a4582c8
Author: Igor Murzov <intergalactic.anonymous@gmail.com>
Date:   Fri Feb 3 00:19:07 2012 -0800

    Input: i8042 - add Lenovo Ideapad U455 to 'reset' blacklist
Comment 20 Zhang Rui 2012-02-13 04:51:04 UTC
Igor,
can you please verify the patch in comment #19 works for you?
Comment 21 Igor Murzov 2012-02-14 17:39:21 UTC
The patch in comment #19 indeed fixes the touchpad issues, but I still observe rare random crashes and screen flicker. Screen flicker seems to happen only after an improper system shutdown.
Comment 22 Igor Murzov 2012-02-14 20:42:50 UTC
Created attachment 72382 [details]
Screenshot of the kernel panic #3

This kernel panic was observed with linux-3.3.0-rc2+.
Comment 23 Igor Murzov 2012-02-20 18:50:41 UTC
Created attachment 72449 [details]
full dmesg output with warning in it

I have recently found the following warning in dmesg:

 ACPI Warning: _BQC returned an invalid level (20120111/video-472)

Looks like a bug in the ACPI table.
Comment 24 Igor Murzov 2012-02-20 18:59:37 UTC
Created attachment 72450 [details]
Disassembled DSDT

Here is the relevant part of DSDT:

 Device (LCD)
 {
     Name (_ADR, 0x0110)
     Name (_DCS, 0x1F)
     Name (_DGS, Zero)
     Name (BLLV, Zero)
     Method (_DSS, 1, NotSerialized)
     {
         S80H (0xD4)
         Sleep (0x14)
         And (_DCS, 0xFFFFFFFD, _DCS)
         And (Arg0, One, Local0)
         ShiftLeft (Local0, One, Local0)
         Or (_DCS, Local0, _DCS)
     }

     Name (BRLV, Package (0x0F)
     {
         0x50, 
         0x43, 
         0x14, 
         0x1B, 
         0x21, 
         0x28, 
         0x2F, 
         0x35, 
         0x3C, 
         0x43, 
         0x49, 
         0x50, 
         0x57, 
         0x5D, 
         0x64
     })
     Method (_BCL, 0, NotSerialized)
     {
         Return (Package (0x0D)
         {
             0x46, 
             0x28, 
             Zero, 
             0x0A, 
             0x14, 
             0x1E, 
             0x28, 
             0x32, 
             0x3C, 
             0x46, 
             0x50, 
             0x5A, 
             0x64
         })
     }

     Method (_BCM, 1, NotSerialized)
     {
         Store (0x35, P80H)
         Divide (Arg0, 0x0A, Local0, Local1)
         Store (Local1, ^^^^LPC0.EC0.BRTS)
     }

     Method (_BQC, 0, NotSerialized)
     {
         Multiply (^^^^LPC0.EC0.BRTS, 0x0A, Local0)
         Return (Local0)
     }
 }

This bug is probably related to https://bugzilla.kernel.org/show_bug.cgi?id=13121
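
To make the relationship between the methods in the DSDT excerpt easier to follow, here is a small Python model of them. It is only a sketch: the function names are invented, the EC field BRTS is modelled as a plain integer, and the report does not say which BRTS value actually triggered the warning. The model merely shows that any EC value outside 0..10 would make _BQC return a level that is absent from _BCL.

```python
# _BCL package from the DSDT excerpt. Per the ACPI specification the first
# two entries are the defaults for AC power and battery power; the remaining
# entries are the supported brightness levels.
BCL = [0x46, 0x28, 0x00, 0x0A, 0x14, 0x1E, 0x28, 0x32, 0x3C, 0x46, 0x50, 0x5A, 0x64]
SUPPORTED_LEVELS = BCL[2:]


def bcm(level):
    """Model _BCM: Divide (Arg0, 0x0A, ...) stores the quotient in BRTS."""
    return level // 10


def bqc(brts):
    """Model _BQC: report BRTS * 0x0A as the current level."""
    return brts * 10


def bqc_is_valid(brts):
    """True when the level _BQC would return is listed in _BCL."""
    return bqc(brts) in SUPPORTED_LEVELS


if __name__ == "__main__":
    print("supported levels:", SUPPORTED_LEVELS)
    # Hypothetical EC values: 10 maps to a listed level, 11 does not.
    for brts in (0, 5, 10, 11):
        print(f"BRTS={brts:2d} -> _BQC={bqc(brts):3d} valid={bqc_is_valid(brts)}")
```

Every level listed in _BCL survives a _BCM followed by _BQC round trip in this model, so an invalid result would have to come from the EC reporting a value the firmware never wrote through _BCM.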
Comment 25 Igor Murzov 2012-02-24 00:12:28 UTC
Created attachment 72472 [details]
dmesg output when display flickering was observed

Booting the kernel with acpi.debug_level=0xF acpi.debug_layer=0x10C10006 revealed that at an early stage of boot LCD_ receives spurious notifies like these:

[    1.220201]   evmisc-0120 [4294967294] ev_queue_notify_reques: Dispatching Notify on [LCD_] Node ffff8800af04e5c8 Value 0x87 (**Device Specific**)
[    1.220431]   evmisc-0196 [4294967294] ev_queue_notify_reques: No notify handler for Notify (LCD_, 87) node ffff8800af04e5c8
[    1.220666]   evmisc-0120 [4294967294] ev_queue_notify_reques: Dispatching Notify on [LCD_] Node ffff8800af04f050 Value 0x87 (**Device Specific**)
[    1.220891]   evmisc-0196 [4294967294] ev_queue_notify_reques: No notify handler for Notify (LCD_, 87) node ffff8800af04f050
[    1.221136]   evmisc-0120 [4294967294] ev_queue_notify_reques: Dispatching Notify on [VPC0] Node ffff8800af052140 Value 0x80 (**Device Specific**)
[    1.221361]   evmisc-0196 [4294967294] ev_queue_notify_reques: No notify handler for Notify (VPC0, 80) node ffff8800af052140

I'm pretty sure that those notifies cause the display flickering and crashes.
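
For longer logs, notifies like the ones quoted above can be tallied per device and value with a short script. This is a minimal sketch: the regular expressions are written against the exact message wording in this log, `count_notifies` is a hypothetical helper name, and the value in the "No notify handler" message is assumed to be hexadecimal like the one in the "Dispatching" message.

```python
import re
from collections import Counter

DISPATCH = re.compile(
    r"Dispatching Notify on \[(\w+)\] Node \w+ Value 0x([0-9A-Fa-f]+)"
)
UNHANDLED = re.compile(r"No notify handler for Notify \((\w+), ([0-9A-Fa-f]+)\)")


def count_notifies(log_text):
    """Count dispatched and unhandled notifies per (device, value) pair."""
    dispatched, unhandled = Counter(), Counter()
    for line in log_text.splitlines():
        m = DISPATCH.search(line)
        if m:
            dispatched[(m.group(1), int(m.group(2), 16))] += 1
            continue
        m = UNHANDLED.search(line)
        if m:
            unhandled[(m.group(1), int(m.group(2), 16))] += 1
    return dispatched, unhandled


# The six early-boot lines quoted in this comment.
SAMPLE = """\
[    1.220201]   evmisc-0120 [4294967294] ev_queue_notify_reques: Dispatching Notify on [LCD_] Node ffff8800af04e5c8 Value 0x87 (**Device Specific**)
[    1.220431]   evmisc-0196 [4294967294] ev_queue_notify_reques: No notify handler for Notify (LCD_, 87) node ffff8800af04e5c8
[    1.220666]   evmisc-0120 [4294967294] ev_queue_notify_reques: Dispatching Notify on [LCD_] Node ffff8800af04f050 Value 0x87 (**Device Specific**)
[    1.220891]   evmisc-0196 [4294967294] ev_queue_notify_reques: No notify handler for Notify (LCD_, 87) node ffff8800af04f050
[    1.221136]   evmisc-0120 [4294967294] ev_queue_notify_reques: Dispatching Notify on [VPC0] Node ffff8800af052140 Value 0x80 (**Device Specific**)
[    1.221361]   evmisc-0196 [4294967294] ev_queue_notify_reques: No notify handler for Notify (VPC0, 80) node ffff8800af052140
"""

if __name__ == "__main__":
    dispatched, unhandled = count_notifies(SAMPLE)
    for (device, value), n in dispatched.items():
        missing = unhandled[(device, value)]
        print(f"{device} 0x{value:02X}: dispatched {n}, without handler {missing}")
```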

What is also interesting in the attached dmesg log is that *one* single Fn+Up or Fn+Down keypress produces *two* notifies, thus skipping some brightness levels:

[ 1917.142236]   evmisc-0120 [4294967289] ev_queue_notify_reques: Dispatching Notify on [LCD_] Node ffff8800af04e5c8 Value 0x86 (**Device Specific**)
[ 1917.142288]   evmisc-0120 [4294967289] ev_queue_notify_reques: Dispatching Notify on [LCD_] Node ffff8800af04f050 Value 0x86 (**Device Specific**)
[ 1917.142330]   evmisc-0120 [4294967289] ev_queue_notify_reques: Dispatching Notify on [VPC0] Node ffff8800af052140 Value 0x80 (**Device Specific**)
[ 1917.142351]   evmisc-0196 [4294967289] ev_queue_notify_reques: No notify handler for Notify (VPC0, 80) node ffff8800af052140
[ 1917.147153]    utils-0286 [4294967282] evaluate_integer      : Return value [40]
[ 1917.156349]    utils-0286 [4294967282] evaluate_integer      : Return value [50]
[ 1917.161988]    utils-0286 [4294967282] evaluate_integer      : Return value [50]
[ 1917.171846]    utils-0286 [4294967282] evaluate_integer      : Return value [60]

I had to add "blacklist video" to my /etc/modprobe.d/blacklist.conf to prevent crashes and flickering, but I'm not able to change display brightness now.
Comment 26 Florian Mickler 2012-04-04 15:02:36 UTC
A patch referencing this bug report has been merged in Linux v3.4-rc1:

commit b60e7f6166857c76871977794fa266b02da1f394
Author: Igor Murzov <intergalactic.anonymous@gmail.com>
Date:   Fri Mar 30 21:32:09 2012 +0400

    ACPI video: Don't start video device until its associated input device has been allocated
Comment 27 Zhang Rui 2012-05-24 08:04:30 UTC
Bug closed.
Comment 28 Igor Murzov 2012-05-24 12:45:30 UTC
I can't say that this bug is fixed. The touchpad works fine now and the kernel doesn't crash, but the video module still doesn't work for me. The screen still flickers from time to time, and increasing or decreasing the brightness skips one brightness level. This results in 6 brightness levels being available in Linux instead of 11.
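
The arithmetic behind "6 instead of 11" can be checked with a tiny model. This is a sketch under simplifying assumptions, not the video driver's actual logic: it assumes every notify moves the brightness by exactly one _BCL level, that stepping starts from the lowest level, and that nothing is clamped at the ends; `reachable` is a hypothetical helper.

```python
# The 11 levels listed by _BCL in comment 24 (0, 10, ..., 100).
LEVELS = list(range(0, 101, 10))


def reachable(levels, start, notifies_per_keypress):
    """Levels a user can land on when each keypress moves several steps."""
    step = notifies_per_keypress
    up = range(start, len(levels), step)   # indices reached by Fn+Up
    down = range(start, -1, -step)         # indices reached by Fn+Down
    return sorted({levels[i] for i in list(up) + list(down)})


if __name__ == "__main__":
    print("one notify per keypress: ", reachable(LEVELS, 0, 1))
    print("two notifies per keypress:", reachable(LEVELS, 0, 2))
```

With two notifies per keypress only every second level is reachable, which gives 6 of the 11 levels in this model.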

Should I open a new bug report?
Comment 29 Igor Murzov 2012-05-29 00:08:08 UTC
I am reopening this bug.
Comment 30 Zhang Rui 2012-11-26 02:51:49 UTC
I'd prefer we'd focus on one problem for each bug report.
At least the patch in comment #26 fixes the oops, right?
So I think we should close this bug and file a new bug report for the screen flicker problem.
Comment 31 Igor Murzov 2012-11-26 19:40:07 UTC
Ok.