Bug 13967 - Revert "ACPICA: Remove obsolete acpi_os_validate_address interface" regression
Status: CLOSED DOCUMENTED
Product: ACPI
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assigned To: acpi_other
Depends on:
Blocks: 13615
Reported: 2009-08-12 06:48 UTC by Markus Trippelsdorf
Modified: 2010-03-08 21:35 UTC (History)

See Also:
Kernel Version: 2.6.31-rc5
Tree: Mainline
Regression: Yes


Attachments
acpidump (281.75 KB, text/plain)
2009-08-12 06:48 UTC, Markus Trippelsdorf

Description Markus Trippelsdorf 2009-08-12 06:48:48 UTC
Created attachment 22686 [details]
acpidump

The revert of "ACPICA: Remove obsolete acpi_os_validate_address interface"
(commit f9ca058430333c9a24c5ca926aa445125f88df18) breaks it8720 hardware
monitoring on my machine:

ACPI: I/O resource it87 [0x295-0x296] conflicts with ACPI region ECRE [0x290-0x2af]
ACPI: Device needs an ACPI driver

I never had problems with hardware monitoring before and this post rc5
revert stops lmsensors from working.
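
For readers unfamiliar with the message: the kernel reports a conflict because the I/O ports the it87 driver wants fall inside a range declared by an ACPI operation region. A minimal sketch of that overlap test, using the addresses from the log line above (the function name is made up for illustration; this is not the kernel's code):

```python
def ranges_overlap(a_start, a_end, b_start, b_end):
    """Return True if the inclusive ranges [a_start, a_end] and
    [b_start, b_end] share at least one address."""
    return a_start <= b_end and b_start <= a_end


# Addresses copied from the log line in the description.
it87_ports = (0x295, 0x296)    # I/O resource requested by it87
ecre_region = (0x290, 0x2AF)   # ACPI region ECRE

if ranges_overlap(*it87_ports, *ecre_region):
    print("it87 ports lie inside ACPI region ECRE")
```

The same test applies to the other reports in this bug; only the driver name, region name and addresses differ.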
Comment 1 Zhang Rui 2009-08-13 03:36:32 UTC
We need to prevent the lm-sensors driver from loading on this machine because ACPI also accesses the same resource, which may cause bugs.
Since lm-sensors has always worked well together with ACPI on your machine, you can try the boot option "acpi_enforce_resources=no" to load the native driver like before.
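
To confirm which setting a running system was actually booted with, the kernel command line can be inspected. A small sketch, assuming the text comes from /proc/cmdline and that "strict" applies when the option is absent (the helper name is hypothetical):

```python
def acpi_enforce_mode(cmdline, default="strict"):
    """Return the value of acpi_enforce_resources found in a kernel
    command line string, or `default` if the option is not present.

    The "strict" default is an assumption of this sketch.
    """
    mode = default
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "acpi_enforce_resources" and sep:
            mode = value  # in this sketch the last occurrence wins
    return mode
```

On a live system the string can be read with open("/proc/cmdline").read(); a mistyped option name is simply not matched and the default is reported.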
Comment 2 Len Brown 2009-08-13 04:13:34 UTC
Markus,
Have you run 2.6.29?
This revert should return 2.6.31 to be exactly like 2.6.29
Comment 3 Markus Trippelsdorf 2009-08-13 06:31:42 UTC
acpi_enforce_resources=no does work here with the current git kernel.

I cannot test 2.6.29 easily, because I use btrfs as my root fs and
there were incompatible format changes in 2.6.31...
Comment 4 Markus Trippelsdorf 2009-08-13 06:44:37 UTC
OK I've booted Ubuntu (2.6.28-14) and everything works fine without
"acpi_enforce_resources=no".
Comment 5 Zeev Tarantov 2009-08-15 10:58:53 UTC
This is a (part of a) diff on dmesg between booting 2.6.31-rc5 and rc6, same config. rc5 works fine. This is a regression in 2.6.31-rc6. I haven't bisected but it seems to be the same thing. Thanks Markus Trippelsdorf for noticing.

@@ -554,6 +553,9 @@
  rtc0: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
  i2c /dev entries driver
  i801_smbus 0000:00:1f.3: PCI INT C -> GSI 18 (level, low) -> IRQ 18
+ ACPI: I/O resource 0000:00:1f.3 [0x400-0x41f] conflicts with ACPI region SMRG [0x400-0x40f]
+ ACPI: Device needs an ACPI driver
+ i801_smbus: probe of 0000:00:1f.3 failed with error -16
  Linux video capture interface: v2.00
  usbcore: registered new interface driver stv680
  stv680 [usb_stv680_init:1551]
@@ -562,7 +564,8 @@
  coretemp coretemp.0: Using relative temperature scale!
  coretemp coretemp.1: Using relative temperature scale!
  w83627ehf: Found W83627DHG chip at 0x290
- w83627ehf w83627ehf.656: VID pins in output mode, CPU VID not available
+ ACPI: I/O resource w83627ehf [0x295-0x296] conflicts with ACPI region HWRE [0x290-0x299]
+ ACPI: Device needs an ACPI driver
  device-mapper: uevent: version 1.0.3
  device-mapper: ioctl: 4.15.0-ioctl (2009-04-01) initialised: dm-devel@redhat.com
  cpuidle: using governor ladder
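
For anyone comparing logs across kernels, the conflict lines can be pulled out of dmesg output mechanically. A rough sketch, assuming the message format shown in the diff above stays unchanged (the regular expression and function name are mine, not part of any existing tool):

```python
import re

HEX = r"0x[0-9a-fA-F]+"
# Matches lines such as:
# ACPI: I/O resource w83627ehf [0x295-0x296] conflicts with ACPI region HWRE [0x290-0x299]
CONFLICT_RE = re.compile(
    r"ACPI: I/O resource (\S+) \[(" + HEX + r")-(" + HEX + r")\] "
    r"conflicts with ACPI region (\S+) \[(" + HEX + r")-(" + HEX + r")\]"
)


def find_conflicts(log_text):
    """Return one dict per ACPI resource conflict line found in log_text."""
    conflicts = []
    for m in CONFLICT_RE.finditer(log_text):
        conflicts.append({
            "driver": m.group(1),
            "resource": (int(m.group(2), 16), int(m.group(3), 16)),
            "region": m.group(4),
            "region_range": (int(m.group(5), 16), int(m.group(6), 16)),
        })
    return conflicts
```

Because the pattern is searched anywhere in the text, diff markers ("+ ") and dmesg timestamps in front of the message do not matter.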
Comment 6 Zhang Rui 2009-08-17 02:41:43 UTC
Markus,

You're right that the lm-sensors driver was always loaded before 2.6.30-rc1.

But note that commit 7e90560c50f754d65884e251e94c1efa2a4b5784 was shipped in 2.6.30-rc1 because we don't want to load the native driver (the lm-sensors driver) when there is a resource conflict between the native driver and ACPI.

Unfortunately, commit f9ca058430333c9a24c5ca926aa445125f88df18 was also introduced in 2.6.30-rc1, which turned 7e90560c50f754d65884e251e94c1efa2a4b5784 into a no-op.

Now that we have reverted commit f9ca058430333c9a24c5ca926aa445125f88df18, the kernel behaves as it should have since 2.6.30-rc1.
Comment 7 Zeev Tarantov 2009-08-17 14:55:15 UTC
So the intent was to break lm_sensors in 2.6.30, but because of a bug it continued to work as the user expects and will only break in 2.6.31.

Users who want to keep using lm_sensors on boards that do not spontaneously shut down due to a conflict between lm_sensors and the BIOS (this I believe is a majority of users) will need to add a kernel boot parameter to signal to the paranoid kernel that their computer is sane, or if they're ASUS users they can switch to the "ASUS ATK0110 ACPI hwmon" driver (http://marc.info/?l=linux-kernel&m=125043293401414).

I suppose lm_sensors team can add this workaround to the installation documentation, and all distros will just ship a patch that adds the boot parameter if a non-ACPI hwmon driver is required for the computer, and if a user has spontaneous shutdown syndrome they'll be told by tech support to disable the boot parameter and live without temperature monitoring and/or complain to manufacturer to get BIOS/linux driver for ACPI hwmon.
Comment 8 Stefan Richter 2009-08-17 17:38:59 UTC
Re comment #7:
> if they're ASUS users they can
> switch to the "ASUS ATK0110 ACPI hwmon" driver

This works perhaps on some but not on all ASUS boards.  (Not on mine for example.) 

I take the liberty to switch the bug status once more, now from INVALID to WILL_NOT_FIX to properly reflect the actual status of this bug:  Support of hardware which previously worked (presumably with some luck) has been deliberately disabled (unless users specify a magic kernel command line parameter), while the ACPI sensors drivers which are required to replace affected legacy drivers will supposedly appear out of thin air any time now.
Comment 9 Zhang Rui 2009-08-18 07:45:59 UTC
(In reply to comment #8)
> I take the liberty to switch the bug status once more, now from INVALID to
> WILL_NOT_FIX to properly reflect the actual status of this bug:  Support of
> hardware which previously worked (presumably with some luck) has been
> deliberately disabled (unless users specify a magic kernel command line
> parameter), while the ACPI sensors drivers which are required to replace
> affected legacy drivers will supposedly appear out of thin air any time now.

In fact, there is no ACPI sensor driver, but some ACPI AML code may poke the same resource at runtime.
Comment 10 Jean Delvare 2009-08-30 07:49:08 UTC
(In reply to comment #7)
> I suppose lm_sensors team can add this workaround to the installation
> documentation, and all distros will just ship a patch that adds the boot
> parameter if a non-ACPI hwmon driver is required for the computer, and if a
> user has spontaneous shutdown syndrome they'll be told by tech support to
> disable the boot parameter and live without temperature monitoring and/or
> complain to manufacturer to get BIOS/linux driver for ACPI hwmon.

No, this will not happen. You seem to think that the evil ACPI folks have broken lm-sensors and this makes the lm-sensors team and all Linux distributions sad and angry. This is the other way around: the lm-sensors team _asked_ the nice ACPI folks for the resource conflict check, under pressure by Linux distributions, the support teams of which were tired of receiving an increasing number of bugs about thermal management breakages caused by ACPI vs native drivers conflicts.

So I am fairly certain that no distribution will make acpi_enforce_resources=no, nor even lax, the default. If anything, some distributions had set their default to strict even before the upstream kernel did. We aim at system stability and reliability first. If this means no hardware monitoring support on some systems, this is certainly sad and I wouldn't want such a machine for myself, but this is still the way to go.

If this makes you unhappy because your system worked fine before, blame it on your mainboard vendor who shipped a broken BIOS which requests I/O regions it doesn't use.

Note that we have already documented the problem, and its workaround for users who wish to apply it, at:
http://hansdegoede.livejournal.com/7932.html
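
For reference, the three settings of acpi_enforce_resources discussed in this bug (strict, lax, no) can be modelled roughly as below. This is a simplified sketch based on the option's description in kernel-parameters.txt, not the kernel's actual code; the function name and return convention are made up for illustration:

```python
import errno


def check_resource_conflict(mode, overlaps):
    """Model the outcome of the conflict check for a native driver.

    Returns (return_code, warning_logged):
      strict: conflicting driver is refused (-EBUSY) and a message is logged
      lax:    conflicting driver may bind, but a warning is logged
      no:     no check is made at all
    """
    if mode not in ("strict", "lax", "no"):
        raise ValueError("unknown mode: %r" % (mode,))
    if mode == "no" or not overlaps:
        return 0, False
    if mode == "lax":
        return 0, True
    return -errno.EBUSY, True
```

EBUSY is error 16 on Linux, which matches the "probe of 0000:00:1f.3 failed with error -16" line in comment 5.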
Comment 11 Jean Delvare 2009-08-30 08:08:19 UTC
(In reply to comment #9)
> In fact, there is no ACPI sensor driver, but some ACPI AML code may poke the
> same resource at runtime.

There are ACPI sensor drivers for some machines: asus_atk0110, eeepc-laptop and thinkpad_acpi. However, I agree that the message "ACPI: Device needs an ACPI driver" is misleading. The device _may_ need an ACPI driver, if the BIOS implemented an API for the device in question (which, AFAIK, can't be checked.) If not, then either the generic "thermal" ACPI driver may be used, or nothing can be done (other than a white list, if we really want to get hardware monitoring on some machines.)

I'll send a patch clarifying this log message.
Comment 12 Stefan Richter 2009-08-30 10:09:26 UTC
Re comment 8:
>>>> This works perhaps on some but not on all ASUS boards.  (Not on
>>>> mine for example.)

PS: "mine" = Asus M3A78-EM with BIOS versions 0701 (8/28/2008) and 1902 (7/14/2009).

Re comment 10:
>> Note that we have already documented the problem, and its workaround
>> for users who wish to apply it, at:
>> http://hansdegoede.livejournal.com/7932.html

It's probably helpful to update this with the factoid that many people see this regression only since mainline 2.6.31, not already in mainline 2.6.30.  (Mentioned in comment 6 and comment 7, also according to my and others' observation.)
Comment 13 Michael Tokarev 2009-09-09 10:34:40 UTC
M3A78-EM mobo here too.  It has been rock solid for 1.5 years of heavy use.  I have a script that monitors system temperatures and fans and adjusts everything (like fancontrol).  The fan starts at very low speed in the BIOS, and at higher load Linux speeds it up.  But today I faced a few reboots in a row and overall system instability.  After trying to understand what's going on, I discovered that sensors no longer works, so my script fails to speed up the fans and hence the system overheats.  Sure, it's a bug in my script -- I added some checks and an explicit shutdown to it.  But hey, it's unwise to fry someone's computer this way!.. (so far thermal protection worked so nothing broke, I just lost a few KBs of unsaved data)

Note that neither asus_atk0110 driver nor the mentioned kernel command line works.  asus_atk0110 loads but does nothing and does not find any sensors, and it87 does not load due to that ACPI resource conflict.
Comment 14 Michael Tokarev 2009-09-09 10:40:15 UTC
Tried the latest BIOS from Asus (1902), but it only adds new CPU support, and the problem stays the same.  Also observed the same issue on an M3A-H/HDMI motherboard, but there I didn't try the boot option yet (asus_atk0110 does not work there either).

As for the original reason for the conflict - I *guess* that it's how Asus implements its Q-Fan (and Q-Fan2 on more expensive boards) feature, i.e. poking at it87 registers to change fan speeds and read temps when enabled.  Here it's disabled (because the feature does not work the way I want it to work).  Moreover, any write to the it87 pwm registers immediately disables Q-Fan, so it stops changing fan speed automatically.
Comment 15 Jean Delvare 2009-09-11 07:18:11 UTC
Michael, maybe your script is not perfect, but fancontrol itself is pretty vulnerable to changes in hardware monitoring devices on the kernel side. It merely assumes that /sys/class/hwmon entries are always the same and in the same order. When switching from it87 to asus_atk0110, this assumption is no longer correct. This has already been reported here:
https://bugzilla.novell.com/show_bug.cgi?id=529483

I'm working on this, and future versions of the fancontrol script should be much more robust, but unfortunately it doesn't help with the installed user base.
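
For home-grown scripts, one way to avoid depending on the hwmon numbering is to look the device up by the name it reports rather than by index. A minimal sketch, assuming each entry exposes a "name" file either directly or under "device/" (the helper name is hypothetical, and the exact name strings drivers report are not taken from this bug):

```python
import os


def find_hwmon_by_name(chip_name, root="/sys/class/hwmon"):
    """Return the path of the first hwmon entry whose name file equals
    chip_name, or None if no such entry exists."""
    if not os.path.isdir(root):
        return None
    for entry in sorted(os.listdir(root)):
        base = os.path.join(root, entry)
        # Assumption: the name file sits in one of these two places.
        for candidate in (os.path.join(base, "name"),
                          os.path.join(base, "device", "name")):
            try:
                with open(candidate) as f:
                    if f.read().strip() == chip_name:
                        return base
            except OSError:
                continue  # missing or unreadable; try the next location
    return None
```

A script that gets None back should stop and fail safe (for example, leave fan control to the BIOS) instead of silently continuing without sensor data.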

> Note that neither asus_atk0110 driver nor the mentioned kernel command
> line works.  asus_atk0110 loads but does nothing and does not find any
> sensors, and it87 does not load due to that ACPI resource conflict.

It is known that the asus_atk0110 driver doesn't work on all Asus boards. However the boot parameter "acpi_enforce_resources=lax" definitely works on all systems, so maybe you made a typo. Please try again.

As for Q-Fan, to the best of my knowledge it is only a marketing name on top of ITE's or Winbond's automatic fan speed control technologies. This means that the monitoring chip is programmed at boot time for a given thermal -> fan speed response strategy and there is no reason to poll the registers after that.
Comment 16 Michael Tokarev 2009-09-15 04:48:27 UTC
> It is known that the asus_atk0110 driver doesn't work on all Asus boards.
> However the boot parameter "acpi_enforce_resources=lax" definitely works on all
> systems, so maybe you made a typo. Please try again.

Ok, I stand corrected.  It really was a typo on my side.  I wonder how it slipped in, since I cut-n-pasted it from some other place -- probably original was mistyped as well.  Right now I've acpi_enforce_resources=no, but you suggest =lax - from the description in kernel-parameters.txt "lax" sounds better.

Oh well.
Comment 17 Billy DeVincentis 2009-09-18 04:21:24 UTC
Does anyone know if the asus_atk0110 driver works on an asus a8n32-sli deluxe? I would guess that unless the asus website lists the driver software for atk0110 for windows systems under their motherboard software, the linux kernel driver won't work for that board but I may not fully understand this.
Comment 18 Jean Delvare 2009-09-18 07:38:30 UTC
Billy, I didn't know Asus had such a list. This would be very interesting for us as a reference, care to provide a link?
Comment 19 Michael Tokarev 2009-09-18 08:56:11 UTC
There's no such list per se.  But each motherboard on their site has several associated links, including "Downloads", which lists Asus-provided software, drivers and BIOSes (it's actually support.asus.com).  So it's motherboard-specific.  For example, for my M3A78-EM motherboard no ATK0110-related software is listed, but for the above mentioned A8N32-SLI Deluxe some ATK software IS listed, here:
http://support.asus.com/download/download.aspx?model=A8N32-SLI%20Deluxe&os=17&SLanguage=en-us
"ACPI driver for ATK 0110 virtual device for Windows 2000/XP(32bit and 64bit)/2003(32bit & 64bit)/VISTA(32bit & 64bit)"
Comment 20 Jean Delvare 2009-09-18 09:34:39 UTC
Ah, OK, thanks for the clarification. This is still useful for per-model checks.
Comment 21 Billy DeVincentis 2009-09-18 12:20:56 UTC
That's interesting. Truth is since I no longer use my windows partition on that box, I never checked. Maybe it's time I push the developers to include the higher version of lm_sensors on my distro (Gentoo). Maybe just maybe the kernel driver asus_atk0110 will work for my mboard and eliminate my need to use lax acpi enforcement.
Comment 22 Zeev Tarantov 2009-09-18 14:43:16 UTC
I've never looked at the Windows drivers for my board, but my ASUS P5K-E/WiFi-AP does have such a Windows driver listed:
http://support.asus.com/download/download.aspx?model=P5K-E/WiFi-AP&SLanguage=en-us

And it does work with the Linux asus_atk0110 driver. Maybe it is a good indicator.
Comment 23 Billy DeVincentis 2009-09-18 19:31:25 UTC
It seems that they list the same Windows driver for both of our boards so I would like to believe that the linux driver should also work. Zeev, what version of lm_sensors do you have installed?
Comment 24 Jean Delvare 2009-09-18 19:36:37 UTC
You need lm-sensors >= 3.1.0 for Asus ATK0110 support.
Comment 25 Billy DeVincentis 2009-09-18 22:10:09 UTC
You are absolutely right, actually I am using 3.1.1 and I have spent some time preparing the ebuilds for our fellow gentoo community and I can confirm that it works perfectly.

See here if you are a gentoo user and need ebuilds to make this work

http://bugs.gentoo.org/show_bug.cgi?id=244598
Comment 26 Captcha 2009-10-12 06:36:39 UTC
same here with Ubuntu Karmic
Kernel 2.6.31-13
lm-sensors 1:3.0.2-2ubuntu4

w83627hf: Found W83627HF chip at 0x290
ACPI: I/O resource w83627hf [0x295-0x296] conflicts with ACPI region IP__ [0x295-0x296]
ACPI: Device needs an ACPI driver

not possible to see sensor information
Comment 27 Yves-Alexis Perez 2009-12-12 15:44:52 UTC
Hmhm, I have the same problem on an Acer Power 2000. it87 won't load, and I don't really know which driver I would be supposed to use in place. 

My guess is there is none, and I'll have to use the acpi_enforce_resources=lax parameter, which I'm not really comfortable with...

My error message is:

[92797.051701] it87: Found IT8718F chip at 0x290, revision 2
[92797.051714] it87: in3 is VCC (+5V)
[92797.051716] it87: in7 is VCCH (+5V Stand-By)
[92797.051765] ACPI: I/O resource it87 [0x295-0x296] conflicts with ACPI region IP__ [0x295-0x296]
Comment 28 Ant 2010-03-08 21:35:31 UTC
FYI. http://pastie.org/860259 for my dmesg since I have the same issue. :(
