Bug 201721
Summary: | fancontrol does not work - regression | ||
---|---|---|---|
Product: | Drivers | Reporter: | Walther Pelser (w.pelser) |
Component: | Hardware Monitoring | Assignee: | Jean Delvare (jdelvare) |
Status: | RESOLVED INVALID | ||
Severity: | normal | CC: | erik.kaneda, linux |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 4.19.2 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
part of yast2 systemd-journal
part of yast2 systemd-journal (right one)
4.19.1 fancontrol.pid
4.19.1 pwmconfig.txt
4.19.2 pwmconfig.txt
strace pwmconfig
fancontrol
output of dmesg 4.19.1
output of dmesg 4.19.2
older kernel dmesg
kernel 4.16.13 dmesg
unpatched dsopcode.c (taken from 4.19.1)
dmesg kernel 4.19.2 acpi_enforce_resources=lax
fancontrol 4.19.1
fancontrol acpi_enforce_resources=lax 4.19.2 + 4.16.13 |
Created attachment 279499 [details]
part of yast2 systemd-journal (right one)
Which hwmon driver(s) are you using? Please attach the output of pwmconfig with both kernels.

> Which hwmon driver(s) are you using?

I use sensors-detect and then pwmconfig, or do you mean something else? When kernel 4.19.2 is in use, there is no fancontrol.pid.

Created attachment 279531 [details]
4.19.1 fancontrol.pid
Created attachment 279533 [details]
4.19.1 pwmconfig.txt
Created attachment 279535 [details]
4.19.2 pwmconfig.txt
Created attachment 279537 [details]
strace pwmconfig
Two possible causes (commit log between 4.19.1 and 4.19.2):

5764ffc8a643 hwmon: (pwm-fan) Set fan speed to 0 on suspend
43cba96d9505 hwmon: (pmbus) Fix page count auto-detection.

I don't immediately see how any of those could result in the observed problems.

Unfortunately, the submitter did not tell which hwmon driver(s) are in use, much less provide any information about the affected system. Differences in instantiated hwmon devices and raw attribute names and values would have helped as well, as might have differences in system configuration and the output of dmesg. We don't even know if this is a PC, an embedded arm system, or something else.

Without additional information I don't think there is anything we can do.

OS is openSUSE-Tumbleweed
Driver see attachment

Created attachment 279543 [details]
fancontrol
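The request above for "instantiated hwmon devices and raw attribute names and values" can be answered by walking sysfs. The following is a minimal sketch, not a tool from lm-sensors: the function name `list_hwmon` and the choice of prefixes are assumptions made for illustration, and it assumes attributes live either in `/sys/class/hwmon/hwmonN` or in its `device` subdirectory.

```python
from pathlib import Path

# Attribute name prefixes of interest for fan control (an illustrative choice).
PREFIXES = ("pwm", "fan", "temp")


def list_hwmon(root="/sys/class/hwmon"):
    """Map each hwmonN directory to (chip name, {attribute: raw value})."""
    devices = {}
    for dev in sorted(Path(root).glob("hwmon*")):
        # Assumption: attributes sit in hwmonN itself or in hwmonN/device.
        bases = [b for b in (dev, dev / "device") if b.is_dir()]
        name = "?"
        attrs = {}
        for base in bases:
            name_file = base / "name"
            if name == "?" and name_file.is_file():
                name = name_file.read_text().strip()
            for attr in sorted(base.iterdir()):
                if attr.is_file() and attr.name.startswith(PREFIXES):
                    try:
                        attrs.setdefault(attr.name, attr.read_text().strip())
                    except OSError:
                        # Some attributes fail to read when no sensor is wired.
                        attrs.setdefault(attr.name, "<unreadable>")
        devices[dev.name] = (name, attrs)
    return devices


if __name__ == "__main__":
    for hwmon, (name, attrs) in list_hwmon().items():
        print(f"{hwmon}: {name}")
        for attr, value in attrs.items():
            print(f"  {attr} = {value}")
```

Running it once under each kernel and comparing the two outputs would show whether a chip such as w83627ehf is missing entirely or only lacks its pwm attributes.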
Created attachment 279545 [details]
output of dmesg 4.19.1
with line
[ 27.615807] w83627ehf w83627ehf.656: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
Created attachment 279547 [details]
output of dmesg 4.19.2
missing line
[ 27.615807] w83627ehf w83627ehf.656: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
(In reply to Guenter Roeck from comment #8)
> Two possible causes (commit log between 4.19.1 and 4.19.2):
>
> 5764ffc8a643 hwmon: (pwm-fan) Set fan speed to 0 on suspend
> 43cba96d9505 hwmon: (pmbus) Fix page count auto-detection.
>
> I don't immediately see how any of those could result in the observed
> problems.
>
> Unfortunately, the submitter did not tell which hwmon driver(s) are in use,
> much less provide any information about the affected system. Differences in
> instantiated hwmon devices and raw attribute names and values would have
> helped

I need a little help with how to get the items mentioned above.

> as well, as might have differences in system configuration and the
> output of dmesg. We don't even know if this is a PC, an embedded arm system,
> or something else.
>
> Without additional information I don't think there is anything we can do.

Thanks

See commit 111650510 ("ACPICA: AML interpreter: add region addresses in global list during initialization"). From its description:

"This commit may result in warning messages that look like the following:

[ 7.871761] ACPI Warning: system_IO range 0x00000428-0x0000042F conflicts With op_region 0x00000400-0x0000047F (\PMIO) (20180531/utaddress-213)
[ 7.871769] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

However, these messages do not signify regressions. It is a result of properly adding address ranges within the global address list."

Remedy would be to boot with acpi-enforce_resources=lax.

The right spelling of the option is: acpi_enforce_resources=lax (underscore, not dash).

You got to love the note "these messages do not signify regressions" in the commit message. From a functional perspective, it totally is a regression. And the patch does not fix any actual bug. It references bug #200011 in the commit message, because the issue was noticed while investigating this bug, however the commit does NOT fix that bug (see https://bugzilla.kernel.org/show_bug.cgi?id=200011#c65 ).

Walther, for completeness, is this a new system, or have you been running older kernels on it? The information I got from the ACPICA guys suggests that kernels up to 4.16 behaved the same as 4.19.3. It would be great if you could tell us whether or not such old kernels were working for you, so that we can confirm we are all talking about the same thing.

Sorry I meant 4.19.2, not 4.19.3, in previous comment.

@ Jean
It is not a new system. When I installed 4.19.2 the running kernel was 4.19.1. As I cannot solve the fancontrol problem with 4.19.2, I run 4.19.1 again and everything works fine again. The "missing line" in comment #12 means to me that the driver is not properly installed, but I could be wrong. At the moment I am trying to build my own kernel 4.19.2 without the two patches mentioned in comment #8, but there are still problems getting it to run, as it is a localyesconfig kernel.

My question really is if you have ever been running a kernel older than 4.19.1 on this machine, and if fancontrol was working back then.

Sorry for this misunderstanding. There were a lot of older kernels with no problems.

Booting with "acpi-enforce_resources=lax" does not change anything.
I made self compiled 4.19.2 without this patches:
5764ffc8a643 hwmon: (pwm-fan) Set fan speed to 0 on suspend
43cba96d9505 hwmon: (pmbus) Fix page count auto-detection
by exchanging pmbus.c and pwm-fan.c with the files from 4.19.1.
But fancontrol does not work.
And the precompiled kernels from openSUSE also have the same problem for me.

As Jean had pointed out in #16, it should have been "acpi_enforce_resources=lax". Sorry for that.

Hi Walter,

Could you try a kernel that is older than 4.17 and post the dmesg of it in a working state? This means that the fan isn't going out of control.

Walther, I don't mean to be rude but it would really help us help you if you would follow the discussion to avoid testing things which we already know will not work.

(In reply to Walther Pelser from comment #22)
> Booting with "acpi-enforce_resources=lax" does not change anything.

That was expected, see comment #15 for the right spelling.

> I made self compiled 4.19.2 without this patches:
> 5764ffc8a643 hwmon: (pwm-fan) Set fan speed to 0 on suspend
> 43cba96d9505 hwmon: (pmbus) Fix page count auto-detection
> by exchanging pmbus.c and pwm-fan.c with the files from 4.19.1.
> But fancontrol does not work.

Again, that was expected. See comment #14 for the patch which is believed to cause your problem. While the commit ID mentioned by Guenter seems incorrect, a quick search gives the correct commit ID:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=22083c028d0b3ee419232d25ce90367e5b25df8f

That's the commit you should try reverting.

Created attachment 279611 [details]
older kernel dmesg
Created attachment 279613 [details]
kernel 4.16.13 dmesg
fancontrol does not work
very surprisingly
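The ACPI resource conflict warnings quoted in this thread can be pulled out of a saved dmesg dump so that the outputs of two kernels are easy to compare. This is a rough sketch only: the helper name `find_acpi_conflicts` is made up for illustration, and the regular expression is fitted to the two message variants quoted in this bug, so other kernel versions may word the warning differently.

```python
import re

# Matches both spellings seen in this thread:
#   "SystemIO range ... conflicts with OpRegion ..."
#   "system_IO range ... conflicts With op_region ..."
CONFLICT_RE = re.compile(
    r"ACPI Warning: \S+ range (0x[0-9a-f]+)-(0x[0-9a-f]+) "
    r"conflicts with \S+ (0x[0-9a-f]+)-(0x[0-9a-f]+) \(([^)]+)\)",
    re.IGNORECASE,
)


def find_acpi_conflicts(dmesg_text):
    """Return (io_start, io_end, region_start, region_end, region_name) tuples."""
    conflicts = []
    for match in CONFLICT_RE.finditer(dmesg_text):
        io_start, io_end, reg_start, reg_end = (
            int(group, 16) for group in match.groups()[:4]
        )
        conflicts.append((io_start, io_end, reg_start, reg_end, match.group(5)))
    return conflicts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python3 conflicts.py dmesg-4.19.2.txt
    with open(sys.argv[1], errors="replace") as handle:
        for io_s, io_e, r_s, r_e, name in find_acpi_conflicts(handle.read()):
            print(f"I/O {io_s:#x}-{io_e:#x} conflicts with {name} {r_s:#x}-{r_e:#x}")
```

A conflict on the I/O range that the native hwmon driver wants to claim is the one that matters for fancontrol; conflicts on unrelated ranges are a separate matter.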
Created attachment 279615 [details]
unpatched dsopcode.c (taken from 4.19.1)
fancontrol is working again
(In reply to Erik Schmauss from comment #24)
> Hi Walter,
>
> Could you try a kernel that is older than 4.17 and post the dmesg of it in a
> working state? This means that the fan isn't going out of control.

I have no idea why fancontrol is not working. I used a precompiled kernel from openSUSE (https://build.opensuse.org/package/show/home%3Atiwai%3Akernel%3A4.16/kernel-default). So I can't help in this case.

(In reply to Guenter Roeck from comment #23)
> As Jean had pointed out in #16, it should have been
> "acpi_enforce_resources=lax". Sorry for that.

I had noticed that comment from Jean. So it was a typo again, but now my own. Would it be useful to try it again with the right "acpi_enforce_resources=lax"?

(In reply to Jean Delvare from comment #25)
> Walther, I don't mean to be rude but it would really help us help you if you
> would follow the discussion to avoid testing things which we already know
> will not work.

Rudeness without excuse has become a great problem for me. (This does not point at you!) So I avoid filing bugs and try to solve my software problems in private.

> > (In reply to Walther Pelser from comment #22)
> > Booting with "acpi-enforce_resources=lax" does not change anything.
>
> That was expected, see comment #15 for the right spelling.

My typo

> > I made self compiled 4.19.2 without this patches:
> > 5764ffc8a643 hwmon: (pwm-fan) Set fan speed to 0 on suspend
> > 43cba96d9505 hwmon: (pmbus) Fix page count auto-detection
> > by exchanging pmbus.c and pwm-fan.c with the files from 4.19.1.
> > But fancontrol does not work.
>
> Again, that was expected. See comment #14 for the patch which is believed to
> cause your problem. While the commit ID mentioned by Guenter seems
> incorrect, a quick search gives the correct commit ID:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=22083c028d0b3ee419232d25ce90367e5b25df8f

I always have had this warning with no real problems:

[ 7.871761] ACPI Warning: system_IO range 0x00000428-0x0000042F conflicts With op_region 0x00000400-0x0000047F (\PMIO) (20180531/utaddress-213)
[ 7.871769] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

So I did not mention that it had to do with the patch, because the remedy was in my eyes no real solution.

> That's the commit you should try reverting.

See attachment "unpatched dsopcode.c (taken from 4.19.1)".

Thanks for your efforts.
Walther

The problem is

[ 28.802635] ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\_SB.PCI0.SBRG.SIOR.HWRE) (20180105/utaddress-247)
[ 28.802642] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

not 428-42F. The above message is also seen with 4.16.13, so the driver should not instantiate there either.

(In reply to Walther Pelser from comment #30)
> I had noticed that comment from Jean. So it was a typo again, but now my
> own. Would it be useful to try it again with the right
> "acpi_enforce_resources=lax"?

Yes, of course.

Created attachment 279659 [details]
dmesg kernel 4.19.2 acpi_enforce_resources=lax
Please look first at line 4 to see whether I added "acpi_enforce_resources=lax" to the boot options in the right way. If yes, the result is: fancontrol does not work. Otherwise give me an example of how to add it in the right way.
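One way to double-check the boot option on the running system is to inspect the kernel command line, which on Linux is exposed in `/proc/cmdline`. The sketch below is illustrative only: the helper name `acpi_enforce_resources` is made up, and it simply reports the exact spelling plus any dash/underscore variants such as the "acpi-enforce_resources" typo discussed in this thread. It makes no assumption about whether the kernel itself would accept such a variant.

```python
OPTION = "acpi_enforce_resources"


def acpi_enforce_resources(cmdline):
    """Return (value, lookalikes) for the option on a kernel command line.

    value      -- value given with the exact spelling, or None if absent
    lookalikes -- tokens that differ from the option only in '-' versus '_'
    """
    value = None
    lookalikes = []
    for token in cmdline.split():
        key, _, val = token.partition("=")
        if key == OPTION:
            value = val
        elif key.replace("-", "_") == OPTION:
            lookalikes.append(token)
    return value, lookalikes


if __name__ == "__main__":
    with open("/proc/cmdline") as handle:
        value, lookalikes = acpi_enforce_resources(handle.read())
    print(f"{OPTION} = {value}")
    for token in lookalikes:
        print(f"possible misspelling: {token}")
```

If the option shows up with the value `lax`, the remaining step described later in this thread still applies: the fancontrol configuration has to be regenerated with pwmconfig.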
#34: Looks ok, and the subsequent warning suggests that the driver is indeed instantiated.

(In reply to Walther Pelser from comment #27)
> Created attachment 279613 [details]
> kernel 4.16.13 dmesg
>
> fancontrol does not work
> very surprisingly

Correct me if I'm wrong but this means that your fancontrol never really worked aside from 4.17 through 4.19. So the claim that you made in comment #21 is incorrect. Is that right?

I'm just trying to understand the situation.

Thanks,
Erik

(In reply to Erik Schmauss from comment #36)
> (In reply to Walther Pelser from comment #27)
> > Created attachment 279613 [details]
> > kernel 4.16.13 dmesg
> >
> > fancontrol does not work
> > very surprisingly
>
> Correct me if I'm wrong but this means that your fancontrol never really
> worked aside from 4.17 through 4.19. So the claim that you made in comment #21
> is incorrect. Is that right?
>
> I'm just trying to understand the situation.
>
> Thanks,
> Erik

You are right. I started with fancontrol this year and it worked all the time. I had forgotten this. So I wrote "a lot of older kernels", which was not the case in connection with fancontrol.

(In reply to Erik Schmauss from comment #36)
> (In reply to Walther Pelser from comment #27)
> > Created attachment 279613 [details]
> > kernel 4.16.13 dmesg
> >
> > fancontrol does not work
> > very surprisingly
>
> Correct me if I'm wrong but this means that your fancontrol never really
> worked aside from 4.17 through 4.19. So the claim that you made in comment #21
> is incorrect. Is that right?
>
> I'm just trying to understand the situation.
>
> Thanks,
> Erik

Now fancontrol works with 4.16.13 too. With "acpi_enforce_resources=lax" the fancontrol file has changed. The old one no longer works. You have to run pwmconfig again, which is now possible. The same fancontrol file works with 4.16.13 and 4.19.2.

Created attachment 279671 [details]
fancontrol 4.19.1
Created attachment 279673 [details]
fancontrol acpi_enforce_resources=lax 4.19.2 + 4.16.13
With acpi_enforce_resources=lax there comes another problem for me in connection with Thermal Monitor (KDE Plasma). It can be seen in fancontrol too. lmsensor atk0110-acpi is no longer available. For me this is not a good development.

ATK0110 is an ACPI interface on top of your hardware monitoring chip. It presents (more or less) the same information as the native w83627ehf driver. You should never run both drivers at the same time (and the ACPI resource conflict detection is meant to prevent it).

The only functional difference between the acpi_atk0110 driver and the w83627ehf driver is that the former does not support manual fan speed control, and as such can't be used as a backend for the fancontrol script. However I'm pretty certain that Asus implements automatic fan speed control profiles in the BIOS, which are more efficient than a software daemon. So the right thing to do for your system is to use the asus_atk0110 driver (which means you should NOT pass "acpi_enforce_resources=lax") and select your preferred fan speed control profile in the BIOS.

(In reply to Jean Delvare from comment #42)
> ATK0110 is an ACPI interface on top of your hardware monitoring chip. It
> presents (more or less) the same information as the native w83627ehf driver.
> You should never run both drivers at the same time (and the ACPI resource
> conflict detection is meant to prevent it).

Since the beginning I have run fancontrol with both drivers, as sensors-detect can find them. There is a warning, but they are working without any problems. So why should I change a working system?

> The only functional difference between the acpi_atk0110 driver and the
> w83627ehf driver is that the former does not support manual fan speed
> control, and as such can't be used as a backend for the fancontrol script.

So both drivers are needed!

> However I'm pretty certain that Asus implements automatic fan speed control
> profiles in the BIOS, which are more efficient than a software daemon. So
> the right thing to do for your system is to use the asus_atk0110 driver
> (which means you should NOT pass "acpi_enforce_resources=lax") and select
> your preferred fan speed control profile in the BIOS.

My asus-board has only one 4-pin connector for the cpu fan. The chassis fan has only a 3-pin connector. The BIOS controls the cpu fan very well; a software daemon would be too dangerous, I think. But the chassis fan is not properly controlled by the BIOS. Too fast and too loud. Fancontrol can manage this fan as if it were on a 4-pin connector, very good.
"acpi_enforce_resources=lax" is needed to have pwmconfig working. But with it or without it the acpi_atk0110 driver is NOT available with kernel 4.19.2.
But in the meantime openSUSE has made changes in their rpm-kernel 4.19.5, so that fancontrol is running again, as it was with kernel 4.19.1. So far, my problem has gone.
Thanks for your answers, but without the help of openSUSE I would have skipped the discussed patch, because it makes more problems for me than it solves.
Walther

NEEDINFO is obsolete? Could you change the status?

Normally fans can be controlled and managed in the BIOS. One can normally select the temperatures used to control each fan, pwm vs. DC control, and manual vs. automatic operation (including fancy temperature/speed control). I personally don't use ASUS boards, but I would be quite surprised if ASUS would be any different.

(In reply to Guenter Roeck from comment #44)
> Normally fans can be controlled and managed in the BIOS. One can normally
> select the temperatures used to control each fan, pwm vs. DC control, and
> manual vs. automatic operation (including fancy temperature/speed control).
> I personally don't use ASUS boards, but I would be quite surprised if ASUS
> would be any different.

You are right regarding newer asus boards. My board is ten years old and the BIOS works as described. It's still a non uefi board.

(In reply to Walther Pelser from comment #43)
> Since the beginning I run fancontrol with both drivers, as sensors-detect
> can find them. There is a warning, but they are working without any
> problems. So why should I change a working system?

I've been driving without my safety belt forever and never had any problem. Why should I start using a safety belt now?

Using 2 drivers for the same device at the same time is simply unsafe. It is racy by design, as the 2 drivers access the same registers without talking to each other. You have been lucky so far, good for you. But someday you will hit the race, and problems will start. Possibly up to a fan stopping completely and your system melting and/or burning. You have been warned.

> So both drivers are needed!

No. IF you insist on having manual fan speed control then the w83627ehf driver is needed INSTEAD OF the asus_atk0110 driver. So, with your setup, you really want to blacklist the asus_atk0110 driver and run pwmconfig again to reconfigure the fancontrol daemon to use the w83627ehf temperatures as its input.

> My asus-board has only one 4-pin connector for the cpu fan. The chassis fan
> has only a 3-pin connector. The BIOS controls the cpu fan very good, a
> software daemon would be too dangerous, as I think. But the chassis fan is
> not properly controlled by the BIOS. Too fast and too loud. Fancontrol can
> manage this fan as if was 4-pin connector, very good.

For completeness, this has little to do with 4-pin vs 3-pin fan connector. The benefit of 4-pin connectors is to allow accurate fan speed monitoring even when fan speed control is in effect and effective speed is very low. But you can still control a fan with a 3-pin connector as long as it never goes in the very low range, or you are not worried about losing monitoring at very low speeds.

(In reply to Walther Pelser from comment #45)
> You are right regarding newer asus boards. My board is ten years old and the
> BIOS works as described. It's still a non uefi board.

For even more completeness, this has nothing to do with UEFI or non-UEFI board. You may still want to look for a BIOS update, by the way. Asus may have improved BIOS-based fan speed control at some point. |
Created attachment 279497 [details]
part of yast2 systemd-journal

With kernel 4.19.2 "fancontrol" does not work. The chassis fan always runs at max speed. See attachment "part of yast2 systemd-journal". "pwmconfig" can't find a pwm capable fan.
After reinstalling kernel 4.19.1, "fancontrol" works fine again.