Distribution: Debian sid
Hardware Environment: AMD Athlon on Asus A7N8X-DX
Software Environment: kernel 2.6.7. I recall it happening on 2.6.6 and possibly 2.6.5.
Problem Description: The w83l785ts driver loads on bootup, but sometimes no entries are created in sysfs, so I can't read the temperature from the processor's diode. The failure is intermittent: on some boots the sysfs entries are created. I can fix it by rmmod'ing w83l785ts and modprobing it again.
Steps to reproduce: Boot/reboot system
gambit:~> lsmod | grep w83l785ts                      [10:07am/06-21-04]
w83l785ts               7556  0
i2c_sensor              3200  2 w83l785ts,asb100
i2c_core               24212  5 w83l785ts,asb100,i2c_sensor,i2c_dev,i2c_nforce2

gambit:/sys/bus/i2c/devices> ls -l                    [10:08am/06-21-04]
total 0
lrwxrwxrwx  1 root root 0 Jun 21 00:37 1-002d -> ../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-002d/
lrwxrwxrwx  1 root root 0 Jun 21 00:37 1-0048 -> ../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-0048/
lrwxrwxrwx  1 root root 0 Jun 21 00:37 1-0049 -> ../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-0049/

gambit:/sys/bus/i2c/devices# rmmod w83l785ts
gambit:/sys/bus/i2c/devices# modprobe w83l785ts
gambit:/sys/bus/i2c/devices# ls -l
total 0
lrwxrwxrwx  1 root root 0 Jun 21 00:37 1-002d -> ../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-002d
lrwxrwxrwx  1 root root 0 Jun 21 10:09 1-002e -> ../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-002e
lrwxrwxrwx  1 root root 0 Jun 21 00:37 1-0048 -> ../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-0048
lrwxrwxrwx  1 root root 0 Jun 21 00:37 1-0049 -> ../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-0049

gambit:~> lsmod | grep w83l785ts                      [10:10am/06-21-04]
w83l785ts               7556  0
i2c_sensor              3200  2 w83l785ts,asb100
i2c_core               24212  5 w83l785ts,asb100,i2c_sensor,i2c_dev,i2c_nforce2
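The manual workaround shown above (rmmod, modprobe, then check sysfs) could be scripted roughly as follows. This is only a sketch: the function name cycle_until_present and the RMMOD/MODPROBE override variables are made up for illustration, and the device node 1-002e is simply the address that appeared in the listing above, so it may differ on other systems.

```shell
#!/bin/sh
# Hypothetical helper: reload a chip driver until its sysfs device node appears.
# RMMOD and MODPROBE may be overridden (e.g. for a dry run); the defaults are the real tools.
RMMOD=${RMMOD:-rmmod}
MODPROBE=${MODPROBE:-modprobe}

# cycle_until_present MODULE SYSFS_NODE [MAX_TRIES]
# Returns 0 as soon as SYSFS_NODE exists, 1 if it is still missing after MAX_TRIES reloads.
cycle_until_present() {
    module=$1
    node=$2
    tries=${3:-3}
    i=0
    while [ ! -e "$node" ]; do
        [ "$i" -lt "$tries" ] || return 1
        "$RMMOD" "$module" 2>/dev/null || true   # ignore "module not loaded"
        "$MODPROBE" "$module" || true
        i=$((i + 1))
    done
    return 0
}

# Example (needs root); the node name is an assumption taken from the listing above:
# cycle_until_present w83l785ts /sys/bus/i2c/devices/1-002e
```

Capping the number of reloads keeps the script from looping forever if the chip is never detected.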
Several Asus boards are known to mess with the SMBus and the W83L785TS-S, causing read errors on the latter. A retry-on-error mechanism was added to the w83l785ts driver in 2.6.3 and was reported to give good results, but obviously not for you.

If read errors happen at the moment the w83l785ts driver is attempting to identify your chip (basically while the driver is loading), it will not recognize the chip and, as a result, you won't see any files under sysfs. Since the errors don't happen all the time, reloading the driver later eventually works.

If my analysis is correct, you should see "w83l785ts: Couldn't read value from register. Please report." messages in the system logs when the problem occurs. Can you please confirm?

It would also help if you could recompile your kernel with "Device Drivers > I2C Support > I2C Chip debugging messages" enabled (if that's not already the case) so that we can gather some statistics on the errors and have a chance to improve the mechanism.
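One way to look for that message is to count matching lines in a saved kernel log. The sketch below assumes the log has been written to a file; the function name count_read_errors and the /var/log/kern.log path in the example are assumptions, and the pattern deliberately matches only the stable part of the message.

```shell
#!/bin/sh
# count_read_errors LOGFILE
# Prints how many lines in LOGFILE mention a w83l785ts register read failure.
count_read_errors() {
    grep -c "w83l785ts.*Couldn't read value from register" "$1"
}

# Example; the log location varies between systems and is an assumption here:
# count_read_errors /var/log/kern.log
# For the live kernel ring buffer instead:
# dmesg | grep w83l785ts
```

A count of zero after a failed boot would suggest that the failure is not caused by the read errors described above.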
I don't remember seeing "w83l785ts: Couldn't read value from register. Please report." I double-checked my logs and didn't find it there either. When I rmmod'd and modprobe'd the driver, there was a message which doesn't appear to have been logged. It was something about a value still being set. When I reproduce the problem, I'll add another comment. I'll recompile with I2C debugging.
Still no I2C debugging, as the kernel hasn't been recompiled yet...

On bootup this is logged on the console:

i2c-nforce2
i2c_adapter i2c-0: nForce2 SMBus adapter at 0x5000
i2c_adapter i2c-1: nForce2 SMBus adapter at 0x5500
i2c-dev
i2c /dev entries driver
asb100
w83l785ts

The error message that you thought should appear did not.
Two additional questions:

1* You say that loading the w83l785ts driver at boot time sometimes works, and sometimes doesn't. Does cycling the driver afterwards *always* work, or can it fail as well?

2* Please try to remove the asb100 driver from the list of chip drivers to be loaded at boot time. Does it help the w83l785ts driver?

BTW I'd like to see the script you use to load these chip drivers at boot time.

Thanks.
1. So far, every subsequent rmmod and modprobe of w83l785ts has succeeded.

2. I'll comment out asb100 in /etc/modules and see how the next few reboots go.

3. I use Debian, so the modules listed in /etc/modules are loaded automatically by the init script from the module-init-tools deb package:

#!/bin/sh -e

# Silently exit if the kernel does not support modules or needs modutils.
[ -f /proc/modules ] || exit 0
[ ! -f /proc/ksyms ] || exit 0
[ -x /sbin/depmod ] || exit 0

PATH="/sbin:/bin"

KVER=$(uname -r)

if [ -w /lib/modules/$KVER/ ]; then
  echo -n "Calculating module dependencies... "
  depmod --quick
  echo "done."
else
  echo "Not running depmod because /lib/modules/$KVER/ is not writeable."
fi

if [ -e /etc/modules-$KVER ]; then
  MODULES_FILE=/etc/modules-$KVER
elif [ -e /etc/modules-2.6 ]; then
  MODULES_FILE=/etc/modules-2.6
else
  MODULES_FILE=/etc/modules
fi

# Loop over every line in /etc/modules.
echo 'Loading modules...'
grep '^[^#]' $MODULES_FILE | \
while read module args; do
  [ "$module" ] || continue
  echo " $module"
  modprobe $module $args || true
done
echo "All modules loaded."

# Just in case a sysadmin prefers generic symbolic links in
# /lib/modules/boot for boot time modules we will load these modules.
boot="$(modprobe --list --type boot)"
for d in $boot; do
  mod="${d##*/}"
  mod="${mod%.ko}"
  modprobe "$mod"
done

exit 0
Hi,

I've had the system booting with only the w83l785ts module loaded at boot, without asb100. Until now, that setup had always created the sysfs entries for the w83l785ts device. This time, no sensors were detected. Running rmmod w83l785ts and then modprobe w83l785ts got it working. Afterwards, dmesg showed:

w83l785ts 1-002e: Updating w83l785ts data.

Off topic: modprobing asb100 creates the sysfs entries, but also gives this in dmesg:

asb100.o: detect failed, bad chip id 0x5c!

However, the asb100 driver did load and I am able to read temperatures from it.
Created attachment 3446: Be more verbose about register reads
Please apply the proposed patch and recompile the driver. Don't forget to enable "I2C Chip debugging messages". The w83l785ts driver will then tell us every value it reads from any register. Please compare a case where it works with a case where it fails, and report. Hopefully this will show some difference which will help us find out where the problem is. As for the asb100 driver, the error message is not for the ASB100 device but for the W83L785TS-S. The asb100 driver will probe both, succeed silently on the ASB100 and fail (and complain) on the W83L785TS-S, which is quite normal since it is really not an ASB100. I don't think that the driver should write such a message since probing other chips (and failing on them) happens quite frequently in I2C chip drivers. I'll contact the author about this.
Actually the asb100 driver only writes the error message in debug mode, so I guess it's acceptable and I won't bug the author with this.
A seemingly similar problem was reported here: http://archives.andrew.net.au/lm-sensors/msg08755.html Changing the order in which the modules are loaded at boot time seemed to help in this case. However, the fact that this same user sometimes had to cycle the w83l785ts module more than once to have the chip properly detected makes me believe that there are two different causes of failure. Without a detailed log of what is read from the chip when a failure occurs (achieved by the attached patch), I can't say more.
John, any news?
Since I took out the automatic loading of the asb100 driver, the w83l785ts driver has worked fine. Manually loading asb100 afterwards posed no problems. I just read your comment about the load order of the modules. It appears I had asb100 loading before w83l785ts when asb100 was enabled. I'll re-enable it, but place it after w83l785ts. This may be a workaround.
John or Jean, Is there any update on this bug? Thanks, Nish
I didn't have this problem again after I changed my module load order to load w83l785ts first, then asb100. Also, I have since replaced the motherboard, so I can't do further tests. The I2C driver I'm waiting for (W83792D) has a patch for 2.4, but not one for 2.6 yet.
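For anyone wanting to verify the same workaround, the load order in a modules file can be checked with a small script. This is a sketch only: the function name loads_before is hypothetical, and it assumes the Debian-style format of one module name per line with "#" starting a comment.

```shell
#!/bin/sh
# loads_before FILE FIRST SECOND
# Succeeds if FIRST is listed on an earlier non-comment line than SECOND in FILE.
loads_before() {
    awk -v a="$2" -v b="$3" '
        $1 ~ /^#/ { next }           # skip comment lines
        $1 == a && !na { na = NR }   # first line naming FIRST
        $1 == b && !nb { nb = NR }   # first line naming SECOND
        END { exit !(na && nb && na < nb) }
    ' "$1"
}

# Example:
# loads_before /etc/modules w83l785ts asb100 && echo "order OK"
```

The check fails if either module is missing from the file, so it also catches the case where asb100 is still commented out.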
It seems rather clear now that the main cause of the problem is an unexpected interaction between the asb100 and w83l785ts drivers, and not read errors as I first suspected (although comment #7 raises questions). The range of possible addresses for the ASB100 chip has been reduced to a single address in 2.6.11, so the asb100 and w83l785ts drivers no longer have overlapping addresses. This might fix this specific interaction problem, but I still don't understand how exactly the drivers could interact with each other in the first place, so the problem might still be present and could resurface at a later time. Nish, any particular reason why you were interested in this bug?
Jean, No particular reason beyond a strong desire to clean up old bugs as they come up. Also, I don't mind helping resolve bugs when I *can* do something. Cleaning out old bugs helps in this regard, too, as I can't really do anything for those bugs which aren't really bugs. In any case, would you rather I kept the bug open, Jean? Or John, would you be willing to test patches Jean or I might be able to generate to determine the root cause? Thanks, Nish
John, ignore my previous comment to you, as you clearly stated you no longer have the applicable mobo... Jean, that inclines me towards closing the bug; if someone else has the problem, I'd rather they open a new one so that it's clear who is able to do the testing. Thanks, Nish
Agreed, there's nothing more we can do at this point in time.