Bug 11296 - 2.6.27-rc2-git4: suspend and power off fails on Asus M3A32-MVP
Summary: 2.6.27-rc2-git4: suspend and power off fails on Asus M3A32-MVP
Status: CLOSED CODE_FIX
Alias: None
Product: Power Management
Classification: Unclassified
Component: cpufreq
Hardware: All Linux
Importance: P1 normal
Assignee: cpufreq
URL:
Keywords:
Depends on:
Blocks: 7216 Regressions-2.6.26
 
Reported: 2008-08-09 15:03 UTC by Rafael J. Wysocki
Modified: 2008-08-22 11:21 UTC (History)

See Also:
Kernel Version: 2.6.27-rc2-git4
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
oops message (39.11 KB, text/plain)
2008-08-22 08:29 UTC, Mark Langsdorf
another crash report (40.23 KB, text/plain)
2008-08-22 08:30 UTC, Mark Langsdorf
yet another crash report (39.97 KB, text/plain)
2008-08-22 08:30 UTC, Mark Langsdorf

Description Rafael J. Wysocki 2008-08-09 15:03:30 UTC
Subject    : 2.6.27-rc2-git4: suspend and power off fails on Asus M3A32-MVP
Submitter  : "Rafael J. Wysocki" <rjw@sisk.pl>
Date       : 2008-08-09 21:21
References : http://marc.info/?l=linux-kernel&m=121831675111794&w=4

This entry is being used for tracking a regression from 2.6.26.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Rafael J. Wysocki 2008-08-15 10:57:26 UTC
Handled-By : "Langsdorf, Mark" <mark.langsdorf@amd.com>
Comment 2 Adrian Bunk 2008-08-20 10:40:32 UTC
fixed by commit f607e3a03c90e8c050cb0c12ec9967c2925cc812
Comment 3 Mark Langsdorf 2008-08-22 08:29:34 UTC
Created attachment 17374
oops message

We used hardware tools to break into a failing system and catch the OOPS.  It's slab corruption on resume, with a slab entry not having any valid links.
Comment 4 Mark Langsdorf 2008-08-22 08:30:21 UTC
Created attachment 17375
another crash report

I rebuilt the kernel with slab and suspend/resume debugging enabled and tried again.  A different but similar crash report.
Comment 5 Mark Langsdorf 2008-08-22 08:30:57 UTC
Created attachment 17376
yet another crash report

One more crash report.  This shows the slab corruption again, but doesn't indicate why.
Comment 6 Rafael J. Wysocki 2008-08-22 09:17:42 UTC
This looks similar to the problem described in this e-mail thread:

http://marc.info/?t=121933979400002&r=1&w=4
Comment 7 Rafael J. Wysocki 2008-08-22 09:26:08 UTC
Well, no, it doesn't really.

Is that with SLUB or SLAB?  If this is with SLUB, then corrupting per-CPU memory could lead to that.

Would it be practicable to split the patch reverted by commit f607e3a03c90e8c050cb0c12ec9967c2925cc812 into two patches, one introducing the per-CPU variables and the other actually causing them to be passed to acpi_processor_preregister_performance()?  Then we could verify which part of the original patch causes the problem.
Comment 8 Mark Langsdorf 2008-08-22 10:24:38 UTC
No, it's SLAB.

I'll see what I can do to split the patch.  It won't be very useful that way, but I could create a bunch of unused per-cpu variables.
Comment 9 Rafael J. Wysocki 2008-08-22 11:21:56 UTC
(In reply to comment #8)
> No, it's SLAB.
>
> I'll see what I can do to split the patch.  It won't be very useful that way,
> but I could create a bunch of unused per-cpu variables.

Well, if they are unused, we won't be able to check if per-CPU memory is corrupted.

In fact, I don't see anything other than some corruption of per-CPU memory that could result from this patch and lead to slab corruption.
