Bug 12464 - PAT: duplicated lines in /sys/kernel/debug/x86/pat_memtype_list
Status: CLOSED INVALID
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64
Hardware: All Linux
Importance: P1 low
Assignee: platform_x86_64@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-01-17 03:09 UTC by Frans Pop
Modified: 2009-01-19 15:48 UTC

See Also:
Kernel Version: 2.6.28
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Contents of /sys/kernel/debug/x86/pat_memtype_list (after X.Org crash) (1.66 KB, text/plain)
2009-01-17 03:11 UTC, Frans Pop
Contents of /proc/mtrr (does not change) (350 bytes, text/plain)
2009-01-17 03:11 UTC, Frans Pop
Full X.Org log (51.76 KB, application/x-trash)
2009-01-17 03:12 UTC, Frans Pop
Kernel config (63.89 KB, text/plain)
2009-01-17 03:14 UTC, Frans Pop

Description Frans Pop 2009-01-17 03:09:05 UTC
I've been seeing regular crashes of X.Org with 2.6.28, each with the same symptoms. However, I've not yet been able to reproduce the issue on demand.

The symptoms are:
- X.Org crashes after starting VirtualBox (both versions 2.0.1 and 2.1.0)
- after that, the display on VT1 is corrupted when I switch to it

One thing that I have discovered is that when the problem occurs, I have a duplicate entry in /sys/kernel/debug/x86/pat_memtype_list which is not there when the system is OK. The diff is (full list attached):

@@ -1,5 +1,6 @@
 PAT memtype list:
 uncached-minus @ 0x7e7b0000-0x7e7b1000
+uncached-minus @ 0x7e7b0000-0x7e7b1000
 uncached-minus @ 0x7e7c4000-0x7e7c5000
 uncached-minus @ 0x7e7c5000-0x7e7c8000
 uncached-minus @ 0x7e7c5000-0x7e7c7000

I'm not sure if the duplicate line gets added before or after the X.Org crash.
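One way to narrow down when the extra line appears would be to save snapshots of the file at different points and compare them as multisets rather than with a plain diff. The following is a minimal sketch in Python, assuming two snapshots are already available as strings; the function name `gained_entries` and the shortened sample data are illustrative only.

```python
from collections import Counter

def gained_entries(before, after):
    """Return lines that occur more often in `after` than in `before`,
    mapped to the number of extra occurrences."""
    # Counter subtraction keeps only positive differences.
    extra = Counter(after.splitlines()) - Counter(before.splitlines())
    return dict(extra)

# Shortened snapshots based on the diff above (hypothetical excerpt).
before = """PAT memtype list:
uncached-minus @ 0x7e7b0000-0x7e7b1000
uncached-minus @ 0x7e7c4000-0x7e7c5000
"""
after = """PAT memtype list:
uncached-minus @ 0x7e7b0000-0x7e7b1000
uncached-minus @ 0x7e7b0000-0x7e7b1000
uncached-minus @ 0x7e7c4000-0x7e7c5000
"""

if __name__ == "__main__":
    for line, n in gained_entries(before, after).items():
        print(f"+{n} {line}")
```

Taking one snapshot right before starting VirtualBox and one right after the crash would show on which side of the crash the entry is added.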

I am using the VESA framebuffer on VT1 (vga=791). The corruption is that after a console switch most of the display is black, at the bottom there is an area of grey horizontal stripes, and at the top there is a small bar where the text of the console looks to be displayed, but in an extremely small font (unreadable, but you can just recognize console messages "scrolling" in that area).
Switching back from VT1 to X.Org works correctly, and if I log in to X.Org again and retry VirtualBox, it works correctly. A reboot is needed to get VT1 displayed correctly again.

The problem occurs on a notebook which I use both undocked and docked. When docked, an external monitor is connected and I use xrandr for dual-display (Xinerama, with external monitor as display 1 and notebook monitor as display 2). I switch the notebook display on and off as needed when docked.

Most of the time starting VirtualBox does not cause any problem and console switching to VT1 is OK too. My impression is that the error only occurs after I have undocked or docked the notebook. I have a script running that checks for dock/undock events and then calls xrandr to automatically enable/disable the external display.

System is HP 2510p notebook running 2.6.28 x86_64, Debian Lenny, KDE desktop.

I will now try running 2.6.28 with 'nopat' and see if I then still see the issue; I will also try 2.6.29 when I can.

----
From X.Org log after crash:
Backtrace:
0: /usr/bin/X(xf86SigHandler+0x6a) [0x48dd0a]
1: /lib/libc.so.6 [0x7fe702829f60]
2: /usr/bin/X(VidModeGetFirstModeline+0x5b) [0x49053b]
3: /usr/bin/X(VidModeGetNumOfModes+0x42) [0x490772]
4: /usr/lib/xorg/modules/extensions//libextmod.so [0x7fe701c60040]
5: /usr/bin/X(Dispatch+0x342) [0x44f7e2]
6: /usr/bin/X(main+0x4a5) [0x436bd5]
7: /lib/libc.so.6(__libc_start_main+0xe6) [0x7fe7028161a6]
8: /usr/bin/X(FontFileCompleteXLFD+0x281) [0x435e99]

Fatal server error:
Caught signal 11.  Server aborting
Comment 1 Frans Pop 2009-01-17 03:11:03 UTC
Created attachment 19846
Contents of /sys/kernel/debug/x86/pat_memtype_list (after X.Org crash)
Comment 2 Frans Pop 2009-01-17 03:11:40 UTC
Created attachment 19847
Contents of /proc/mtrr (does not change)
Comment 3 Frans Pop 2009-01-17 03:12:36 UTC
Created attachment 19848
Full X.Org log
Comment 4 Frans Pop 2009-01-17 03:14:32 UTC
Created attachment 19850
Kernel config
Comment 5 Andrew Morton 2009-01-17 04:41:02 UTC
Reassigned to x86_64.

Is this a regression?  Was any previous kernel OK?
Comment 6 Frans Pop 2009-01-17 05:06:02 UTC
I don't have any data for previous kernel versions, so I cannot say if it is a regression or not. I could try .27 if I find a reliable way to reproduce.
Comment 7 Frans Pop 2009-01-18 09:36:42 UTC
I've now found a way to reproduce this and the good news is that it is unrelated to PAT: I can also reproduce it after booting with 'nopat'.

The way to reproduce it is:
- boot with laptop docked
- log in to KDE and activate dual display with:
  xrandr --output VGA --left-of LVDS --mode 1280x1024
- disable the laptop display:
  xrandr --output LVDS --off
- undock the laptop, which results in:
  xrandr --output LVDS --auto --output VGA --off
- start VirtualBox virtual machine

If I dock the laptop again before starting VirtualBox, the X.Org crash does not happen. It also does not happen if the laptop display is enabled before I undock it.
Also, the display corruption on VT1 only happens after the X.Org crash, not if I switch consoles before starting VirtualBox.


The added line in pat_memtype_list looks to be unrelated, and is even somewhat "normal". Even after I have just booted my laptop (with KDE running and dual-display active), I see a number of duplicate lines in that file:

# cat /sys/kernel/debug/x86/pat_memtype_list | sort | uniq -c
      1 PAT memtype list:
      1 uncached-minus @ 0x7e7b0000-0x7e7b1000
      1 uncached-minus @ 0x7e7c4000-0x7e7c5000
      1 uncached-minus @ 0x7e7c5000-0x7e7c7000
      1 uncached-minus @ 0x7e7c5000-0x7e7c8000
!!    7 uncached-minus @ 0x7e7c8000-0x7e7c9000
      1 uncached-minus @ 0x7e7c8000-0x7e7dc000
      1 uncached-minus @ 0x7e7db000-0x7e7dc000
!!    3 uncached-minus @ 0x7e7dc000-0x7e7dd000
!!    2 uncached-minus @ 0x7e7e7000-0x7e7e8000
      1 uncached-minus @ 0x7e7e9000-0x7e7ea000
      1 uncached-minus @ 0x88000000-0x88001000
      1 uncached-minus @ 0xd0000000-0xd0020000
      1 uncached-minus @ 0xd0000000-0xd0300000
      1 uncached-minus @ 0xd0000000-0xe0000000
      1 uncached-minus @ 0xe0000000-0xe0002000
      1 uncached-minus @ 0xe0100000-0xe0101000
      1 uncached-minus @ 0xe0101000-0xe0102000
      1 uncached-minus @ 0xe0102000-0xe0103000
!!    3 uncached-minus @ 0xe0400000-0xe0480000
      1 uncached-minus @ 0xe0400000-0xe0500000
!!    2 uncached-minus @ 0xe0480000-0xe0500000
      1 uncached-minus @ 0xe0620000-0xe0640000
      1 uncached-minus @ 0xe0640000-0xe0641000
      1 uncached-minus @ 0xe0641000-0xe0642000
      1 uncached-minus @ 0xe0644000-0xe0648000
      1 uncached-minus @ 0xe0648000-0xe0649000
      1 uncached-minus @ 0xf8000000-0xfc000000
      1 uncached-minus @ 0xfed00000-0xfed01000
!!    2 uncached-minus @ 0xfed93000-0xfed94000

The first time I undock and redock the laptop, just one extra duplicate line is added. Is this something that should be looked into?
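The `sort | uniq -c` check above can also be scripted so that only the repeated entries are reported. This is a minimal sketch in Python, assuming the file format shown in the listing (a header line followed by one entry per line); the function names are hypothetical and the sample is a shortened excerpt.

```python
from collections import Counter

def count_memtype_entries(text):
    """Count identical entry lines in a pat_memtype_list dump, skipping the header."""
    lines = [ln.strip() for ln in text.splitlines()]
    entries = [ln for ln in lines if ln and not ln.startswith("PAT memtype list")]
    return Counter(entries)

def duplicated(counts):
    """Return (entry, count) pairs for entries that appear more than once."""
    return sorted((entry, n) for entry, n in counts.items() if n > 1)

# Shortened sample; the real file is /sys/kernel/debug/x86/pat_memtype_list
# and reading it typically needs root and a mounted debugfs.
sample = """PAT memtype list:
uncached-minus @ 0x7e7b0000-0x7e7b1000
uncached-minus @ 0x7e7b0000-0x7e7b1000
uncached-minus @ 0x7e7c4000-0x7e7c5000
"""

if __name__ == "__main__":
    for entry, n in duplicated(count_memtype_entries(sample)):
        print(n, entry)
```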
Comment 8 Frans Pop 2009-01-18 09:57:51 UTC
I've just tested this with 2.6.26.3 and with that kernel I can also reproduce the X.Org crash, so it is definitely not a recent kernel regression.

I have also found that the X.Org crash does not happen if after undocking I first enable the laptop display and only then disable the external monitor.
So if instead of:
xrandr --output LVDS --auto --output VGA --off
I do:
xrandr --output LVDS --auto
xrandr --output VGA --off
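For a dock-event script, the ordering that avoids the crash could be wrapped in a small helper. This is a minimal sketch in Python, assuming the output names LVDS and VGA from the commands above; the function names and the `dry_run` flag are hypothetical, and the default only prints the commands instead of running them.

```python
import subprocess

def undock_commands(internal="LVDS", external="VGA"):
    """Build two separate xrandr invocations: enable the internal panel
    first, then switch the external output off."""
    return [
        ["xrandr", "--output", internal, "--auto"],
        ["xrandr", "--output", external, "--off"],
    ]

def run_undock(dry_run=True):
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in undock_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if xrandr fails,
            # so the second step is not attempted after a failed first step.
            subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_undock(dry_run=True)
```

Keeping the two steps as separate invocations is the point here: the combined single command is the form that triggered the crash.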

Looks to me like this is an X.Org issue, and not a kernel bug. So if the duplicated lines in pat_memtype_list are not an issue, this BR can be closed.
Comment 9 Suresh B Siddha 2009-01-19 15:07:39 UTC
Duplicate lines in pat_memtype_list don't really indicate an issue. They just indicate that more than one user has mapped that address range with the corresponding attribute.
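To illustrate why repeated lines are expected, here is a toy model in Python in which every reservation of a range adds its own list entry. It is only a sketch of the idea described above, not the kernel's actual data structure; the class and method names are hypothetical.

```python
class MemtypeList:
    """Toy model: one list entry per reservation, so a range mapped by
    several users shows up several times."""

    def __init__(self):
        self.entries = []

    def reserve(self, start, end, memtype="uncached-minus"):
        # Each user that maps the range adds its own entry.
        self.entries.append((memtype, start, end))

    def free(self, start, end, memtype="uncached-minus"):
        # Removes a single matching entry; other users keep theirs.
        self.entries.remove((memtype, start, end))

    def dump(self):
        ordered = sorted(self.entries, key=lambda e: (e[1], e[2]))
        return [f"{t} @ 0x{s:x}-0x{e:x}" for t, s, e in ordered]

if __name__ == "__main__":
    ml = MemtypeList()
    ml.reserve(0x7e7b0000, 0x7e7b1000)  # first user
    ml.reserve(0x7e7b0000, 0x7e7b1000)  # second user of the same range
    print("\n".join(ml.dump()))
```

In this model a duplicate line disappears again once one of the users releases its mapping, which matches the reading that duplicates by themselves are harmless.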
Comment 10 Frans Pop 2009-01-19 15:48:04 UTC
OK, let's close it then.

JFTR, the X.Org issue is now: http://bugs.freedesktop.org/show_bug.cgi?id=19643
