Bug 15257

Summary: drivers/input/misc/winbond-cir.c breaks suspend
Product: Drivers Reporter: David Härdeman (david)
Component: Input DevicesAssignee: David Härdeman (david)
Status: RESOLVED CODE_FIX    
Severity: normal CC: alan, rui.zhang
Priority: P1    
Hardware: All   
OS: Linux   
URL: http://www.spinics.net/lists/linux-input/msg07041.html
Kernel Version: git checkout between 2.6.33-rc6 and 2.6.33-rc7 Subsystem:
Regression: No Bisected commit-id:
Bug Depends on:    
Bug Blocks: 56331    
Attachments: debug log captured by netconsole during suspend/resume/suspend
decompiled dsdt

Description David Härdeman 2010-02-08 10:09:06 UTC
I've been trying to debug a problem with my driver
(drivers/input/misc/winbond-cir.c) and suspend/resume.

If the driver is loaded, my test computer (Intel DG45FC motherboard)
survives one suspend/resume cycle but hangs (no pings, no magic sysrq,
etc) on the second suspend attempt (tested on 2.6.32.X and currently
testing on a git checkout from yesterday).

After sprinkling a lot of printk's everywhere, it seems that the kernel
freezes either in the below call, or at a random time soon after it:

drivers/pnp/driver.c:pnp_bus_suspend(), calls
pnp_dev->protocol->suspend()

Since this is a pnpacpi device, that is a call to:

drivers/pnp/pnpacpi/core.c:pnpacpi_disable_resources(), which calls the
ACPI "_DIS" method for the "UAR3" object.

Now, if I add a hack to my driver to remove the PNP_DISABLE flag from
device->capabilities (which means, AFAIK, that the above ACPI "_DIS"
method won't get called during suspend), I can reliably suspend and
resume, which is why I believe that this has got something to do with
ACPI even though the crash during suspend doesn't always happen while
"my" drivers _DIS method is being executed.

I've tried debugging the ACPI method by doing:
echo _DIS   >> /sys/module/acpi/parameters/trace_method_name
echo enable >> /sys/module/acpi/parameters/trace_state

and done a suspend/resume/suspend (i.e. caused the system to crash)
while capturing the output via netconsole. I can't derive any
interesting insights from the log though.

The decompiled dsdt table and the log capture from netconsole is
attached....the first suspend starts at 203.176799 (the first line of
the log) and ends around 209.083233 (line 2576), the resume starts right
after the suspend and ends around 211.358430 (line 2712), the second
suspend starts at 342.876299 (line 2718) and ends where the log file
ends (with a freeze).

I'll add the log and dsdt to this bug report...
Comment 1 David Härdeman 2010-02-08 10:10:32 UTC
Created attachment 24949 [details]
debug log captured by netconsole during suspend/resume/suspend
Comment 2 David Härdeman 2010-02-08 10:11:50 UTC
Created attachment 24950 [details]
decompiled dsdt
Comment 3 Alan 2010-02-09 22:23:28 UTC
Sorry - semi automatic adding of maintainer backfired given who filed the bug itself
Comment 4 David Härdeman 2010-02-10 09:09:49 UTC
I'm just getting more and more confused in trying to debug this...if I insert this:

        if (1) {
                struct acpi_device *acpi_dev = device->data;
                acpi_handle handle = acpi_dev->handle;
                char prefix[80] = {'\0'};
                struct acpi_buffer buffer = {sizeof(prefix), prefix};

                acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer);
                printk("Calling _DIS  on %s (0x%p)\n", prefix, handle);

                acpi_dbg_level=0xFFFFFFFF;
                printk("_DIS first attempt\n");
                if (ACPI_FAILURE(acpi_evaluate_object(handle, "_DIS", NULL, NULL)))
                        printk("Calling _DIS once Failed\n");
                printk("_DIS second attempt\n");
                if (ACPI_FAILURE(acpi_evaluate_object(handle, "_DIS", NULL, NULL)))
                        printk("Calling _DIS twice Failed\n");
                acpi_dbg_level=0x3;
        }

at random places in my driver's probe method, it seems to work as long as it's placed before the request_irq() call, whenever it's after the request_irq() call, the machine hangs...

But looking at the "_DIS" method from the dsdt...all it does (as far as I can tell) is:

1) Acquire mutex
2) Set logical device number in SuperIO chip
3) Write a disable bit at 0x30
4) Release mutex

So why would the request_irq() call have any bearing on "_DIS"....???
Comment 5 David Härdeman 2010-02-10 09:28:09 UTC
Answer: Because calling _DIS immediately generates an IRQ with all status bits set...would the proper fix be to use disable_irq() / enable_irq() at the right places (suspend/resume) in the driver?
Comment 6 David Härdeman 2010-02-10 22:00:59 UTC
Patch sent to linux-input and linux-kernel
Comment 7 Zhang Rui 2010-02-21 08:53:55 UTC
please attach the url to this patch.
re-assign to linux-input category.
Comment 8 David Härdeman 2010-02-21 21:12:38 UTC
Patch URL added...
Comment 9 David Härdeman 2010-03-03 11:08:48 UTC
Fixed with commit 197d4db752e67160d79fed09968c2140376a80a3