Bug 2899 - w83l785ts doesn't create structures in sysfs
Summary: w83l785ts doesn't create structures in sysfs
Status: REJECTED INSUFFICIENT_DATA
Alias: None
Product: Drivers
Classification: Unclassified
Component: Hardware Monitoring
Hardware: i386 Linux
Importance: P2 normal
Assignee: Jean Delvare
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-06-16 14:01 UTC by John Wong
Modified: 2005-08-08 12:20 UTC

See Also:
Kernel Version: 2.6.7
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Be more verbose about register reads (767 bytes, patch)
2004-07-31 07:01 UTC, Jean Delvare

Description John Wong 2004-06-16 14:01:25 UTC
Distribution: Debian sid
Hardware Environment: AMD Athlon on Asus A7N8X-DX
Software Environment: kernel 2.6.7.  I recall it happening on 2.6.6 and possibly
2.6.5.
Problem Description: The w83l785ts driver loads on bootup, but no entries are
created in sysfs, so I can't read the temperature from the processor's diode.
The failure is intermittent: on some boots the sysfs entries are created.  I
can fix it by rmmod'ing w83l785ts and modprobing it again.

Steps to reproduce: Boot/reboot system
Comment 1 John Wong 2004-06-21 10:11:52 UTC
gambit:~> lsmod | grep w83l785ts                            [10:07am/06-21-04]
w83l785ts               7556  0 
i2c_sensor              3200  2 w83l785ts,asb100
i2c_core               24212  5 w83l785ts,asb100,i2c_sensor,i2c_dev,i2c_nforce2 

gambit:/sys/bus/i2c/devices> ls -l                          [10:08am/06-21-04]
total 0
lrwxrwxrwx    1 root     root            0 Jun 21 00:37 1-002d ->
../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-002d/
lrwxrwxrwx    1 root     root            0 Jun 21 00:37 1-0048 ->
../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-0048/
lrwxrwxrwx    1 root     root            0 Jun 21 00:37 1-0049 ->
../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-0049/


gambit:/sys/bus/i2c/devices# rmmod w83l785ts
gambit:/sys/bus/i2c/devices# modprobe w83l785ts
gambit:/sys/bus/i2c/devices# ls -l
total 0
lrwxrwxrwx    1 root     root            0 Jun 21 00:37 1-002d ->
../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-002d
lrwxrwxrwx    1 root     root            0 Jun 21 10:09 1-002e ->
../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-002e
lrwxrwxrwx    1 root     root            0 Jun 21 00:37 1-0048 ->
../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-0048
lrwxrwxrwx    1 root     root            0 Jun 21 00:37 1-0049 ->
../../../devices/pci0000:00/0000:00:01.1/i2c-1/1-0049

gambit:~> lsmod | grep w83l785ts                            [10:10am/06-21-04]
w83l785ts               7556  0 
i2c_sensor              3200  2 w83l785ts,asb100
i2c_core               24212  5 w83l785ts,asb100,i2c_sensor,i2c_dev,i2c_nforce2 
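
The manual rmmod/modprobe workaround shown above could be scripted along these
lines. This is only a sketch: the function names, the retry count, and the
one-second delay are assumptions, and 1-002e is simply the entry that appeared
in the second listing above.

```shell
#!/bin/sh
# Sketch of a boot-time workaround: reload a chip driver until its
# device shows up under sysfs. Function names, retry count and delay
# are illustrative assumptions, not part of any existing tool.

# device_present DIR NAME: succeed if DIR/NAME exists (file, dir or symlink).
device_present() {
    [ -e "$1/$2" ] || [ -L "$1/$2" ]
}

# cycle_until_present MODULE DIR NAME [TRIES]
# Return 0 as soon as the device exists, 1 if it never appears.
cycle_until_present() {
    module=$1 dir=$2 name=$3 tries=${4:-3}
    while [ "$tries" -gt 0 ]; do
        device_present "$dir" "$name" && return 0
        rmmod "$module" 2>/dev/null || true
        modprobe "$module" || true
        sleep 1
        tries=$((tries - 1))
    done
    device_present "$dir" "$name"
}

# Example (as root):
#   cycle_until_present w83l785ts /sys/bus/i2c/devices 1-002e
```

Checking for the device first means the driver is left alone on boots where
detection already succeeded.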
Comment 2 Jean Delvare 2004-06-29 11:59:02 UTC
Several Asus boards are known to mess with the SMBus and W83L785TS-S and cause
read errors on the latter. A retry-on-error mechanism has been implemented in
the w83l785ts driver in 2.6.3, which was reported to give good results, but
obviously not for you.

If read errors happen at the moment the w83l785ts driver is attempting to
identify your chip (basically while the driver is loading), it will not
recognize it and, as a result, you don't see any files under sysfs. Since the
errors don't happen all the time, reloading the driver later eventually works.

If my analysis is correct, you should see "w83l785ts: Couldn't read value from
register. Please report." messages in system logs when the problem occurs. Can
you please confirm?

It would also help if you could recompile your kernel with "Device Drivers > I2C
Support > I2C Chip debugging messages" enabled (if that's not already the case)
so that we can do some stats on the errors and have a chance to improve the
mechanism.
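
One way to check for the read-error message in a saved log is sketched below.
The function names are hypothetical, and the kernel config symbol
CONFIG_I2C_DEBUG_CHIP is assumed to be the one behind the "I2C Chip debugging
messages" menu entry; verify it against your own tree.

```shell
#!/bin/sh
# count_read_errors FILE: print how many w83l785ts lines in FILE contain
# the read-error text. Only the stable part of the message is matched,
# since the exact prefix and suffix may differ between kernel versions.
count_read_errors() {
    grep 'w83l785ts' "$1" | grep -c "Couldn't read value from register" || true
}

# chip_debug_enabled CONFIG_FILE: succeed if chip debugging is built in.
# CONFIG_I2C_DEBUG_CHIP is an assumed symbol name (see the text above).
chip_debug_enabled() {
    grep -q '^CONFIG_I2C_DEBUG_CHIP=y' "$1"
}

# Example:
#   dmesg > /tmp/dmesg.txt
#   count_read_errors /tmp/dmesg.txt
#   chip_debug_enabled "/boot/config-$(uname -r)" && echo "debug enabled"
```

A count of zero on a boot where detection failed would suggest the failure is
not caused by read errors.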
Comment 3 John Wong 2004-06-29 19:38:20 UTC
I don't remember seeing "w83l785ts: Couldn't read value from
register. Please report."  I double checked my logs, and didn't see it in there
either.

When I rmmod and modprobe'd it, there was a message which doesn't appear to be
logged.  It was something about value still set.  When I reproduce, I'll add
another comment.

I'll recompile with I2C debugging.
Comment 4 John Wong 2004-07-01 01:44:33 UTC
Still no I2C debugging as kernel hasn't been recompiled yet...

On bootup this is logged on the console:

    i2c-nforce2
i2c_adapter i2c-0: nForce2 SMBus adapter at 0x5000
i2c_adapter i2c-1: nForce2 SMBus adapter at 0x5500
    i2c-dev
i2c /dev entries driver
    asb100
    w83l785ts

The error message that you thought should appear did not.
Comment 5 Jean Delvare 2004-07-09 04:37:38 UTC
Two additional questions:

1* You say that loading the w83l785ts driver at boot time sometimes works, and
sometimes doesn't. Does cycling the driver afterwards *always* work, or can it
fail as well?

2* Please try to remove the asb100 driver from the list of chip drivers to be
loaded at boot time. Does it help the w83l785ts driver?

BTW I'd like to see the script you use to load these chip drivers at boot time.

Thanks.
Comment 6 John Wong 2004-07-09 08:23:56 UTC
1. So far, each subsequent time I've rmmod and modprobe'd w83l785ts has succeeded.

2. I'll comment out asb100 from /etc/modules and see how the next few reboots go.

3. I use debian, so the modules that are defined in /etc/modules are
automatically loaded through module-init-tools provided by the module-init-tools
deb package.

#!/bin/sh -e

# Silently exit if the kernel does not support modules or needs modutils.
[ -f /proc/modules ] || exit 0
[ ! -f /proc/ksyms ] || exit 0
[ -x /sbin/depmod ] || exit 0

PATH="/sbin:/bin"

KVER=$(uname -r)
if [ -w /lib/modules/$KVER/ ]; then
  echo -n "Calculating module dependencies... "
  depmod --quick
  echo "done."
else
  echo "Not running depmod because /lib/modules/$KVER/ is not writeable."
fi

if [ -e /etc/modules-$KVER ]; then
  MODULES_FILE=/etc/modules-$KVER
elif [ -e /etc/modules-2.6 ]; then
  MODULES_FILE=/etc/modules-2.6
else
  MODULES_FILE=/etc/modules
fi

# Loop over every line in /etc/modules.
echo 'Loading modules...'
grep '^[^#]' $MODULES_FILE | \
while read module args; do
  [ "$module" ] || continue
  echo "    $module"
  modprobe $module $args || true
done
echo "All modules loaded."

# Just in case a sysadmin prefers generic symbolic links in
# /lib/modules/boot for boot time modules we will load these modules.
boot="$(modprobe --list --type boot)"
for d in $boot; do
    mod="${d##*/}"
    mod="${mod%.ko}"
    modprobe "$mod"
done
exit 0
Comment 7 John Wong 2004-07-31 00:48:30 UTC
Hi,

I have the system booting with only the w83l785ts module loaded, without
asb100.  Until now, that setup had always created the sysfs entries for the
w83l785ts device.  This time, no sensors were detected.

rmmod w83l785ts
then modprobe w83l785ts got it working.

Afterwards, in dmesg:
w83l785ts 1-002e: Updating w83l785ts data.

Offtopic, modprobing asb100 creates the sysfs entries, but also gives this in dmesg:
asb100.o: detect failed, bad chip id 0x5c!

but the asb100 driver did load and I am able to read temperatures off it.
Comment 8 Jean Delvare 2004-07-31 07:01:38 UTC
Created attachment 3446
Be more verbose about register reads
Comment 9 Jean Delvare 2004-07-31 07:10:19 UTC
Please apply the proposed patch and recompile the driver. Don't forget to enable
"I2C Chip debugging messages". The w83l785ts driver will then tell us every
value it reads from any register. Please compare a case where it works with a
case where it fails, and report. Hopefully this will show some difference which
will help us find out where the problem is.
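
The comparison between a working and a failing load could be done roughly as
follows, assuming the kernel log from each case has been saved to a file. The
function names and file paths are placeholders, not part of the patch.

```shell
#!/bin/sh
# driver_lines FILE: print only the lines of FILE that mention w83l785ts.
driver_lines() {
    grep 'w83l785ts' "$1" || true
}

# compare_logs GOOD BAD: print a unified diff of the driver lines from a
# working log and a failing log. Returns 0 if they match, 1 if they differ.
compare_logs() {
    good=$(mktemp) bad=$(mktemp)
    driver_lines "$1" > "$good"
    driver_lines "$2" > "$bad"
    diff -u "$good" "$bad" && status=0 || status=$?
    rm -f "$good" "$bad"
    return "$status"
}

# Example (paths are placeholders):
#   dmesg > /root/dmesg-works.txt     # after a successful load
#   dmesg > /root/dmesg-fails.txt     # after a failed load
#   compare_logs /root/dmesg-works.txt /root/dmesg-fails.txt
```

Filtering to the driver's own lines keeps unrelated boot messages out of the
diff.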

As for the asb100 driver, the error message is not for the ASB100 device but for
the W83L785TS-S. The asb100 driver will probe both, succeed silently on the
ASB100 and fail (and complain) on the W83L785TS-S, which is quite normal since
it is really not an ASB100. I don't think that the driver should write such a
message since probing other chips (and failing on them) happens quite frequently
in I2C chip drivers. I'll contact the author about this.
Comment 10 Jean Delvare 2004-07-31 07:17:50 UTC
Actually the asb100 driver only writes the error message in debug mode, so I
guess it's acceptable and I won't bug the author with this.
Comment 11 Jean Delvare 2004-09-01 05:50:24 UTC
A seemingly similar problem was reported here:
http://archives.andrew.net.au/lm-sensors/msg08755.html

Changing the order in which the modules are loaded at boot time seemed to help
in this case.

However, the fact that this same user sometimes had to cycle the w83l785ts
module more than once to have the chip properly detected makes me believe that
there are two different causes of failure.

Without a detailed log of what is read from the chip when a failure occurs
(achieved by the attached patch), I can't say more.
Comment 12 Jean Delvare 2004-11-20 03:07:13 UTC
John, any news?
Comment 13 John Wong 2004-11-20 13:13:15 UTC
Since I took out the automatic loading of the asb100 driver, the w83l785ts
driver has worked fine since.  Manually loading the asb100 afterwards posed no
problems.

I just read your comment about the load order of the modules.  It appears I had
asb100 loading before w83l785ts when asb100 was enabled.  I'll re-enable it,
but place it after w83l785ts.  This may be a workaround.
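
A quick check of the load order in a Debian-style modules file might look like
this. It is a sketch only: loads_before is a hypothetical helper, and it
assumes one module name per line with '#' starting a comment.

```shell
#!/bin/sh
# loads_before FILE FIRST SECOND: succeed if module FIRST is listed
# before module SECOND in FILE. Fails if either module is missing.
loads_before() {
    awk -v a="$2" -v b="$3" '
        /^[ \t]*#/ { next }            # skip comment lines
        $1 == a && !pa { pa = NR }     # first line naming FIRST
        $1 == b && !pb { pb = NR }     # first line naming SECOND
        END { exit !(pa && pb && pa < pb) }
    ' "$1"
}

# Example:
#   loads_before /etc/modules w83l785ts asb100 && echo "order OK"
```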
Comment 14 Nishanth Aravamudan 2005-03-02 22:53:31 UTC
John or Jean,

Is there any update on this bug?

Thanks,
Nish
Comment 15 John Wong 2005-03-03 00:03:28 UTC
I didn't have this problem again after I changed my module load order to load
w83l785ts first, then asb100.

Also, I replaced the motherboard, so I can't do further tests.  The I2C driver
I'm waiting for has a patch for 2.4, but not one for 2.6 yet (W83792D)
Comment 16 Jean Delvare 2005-03-03 02:20:19 UTC
It seems rather clear now that the main cause of the problem is an unexpected
interaction between the asb100 and w83l785ts drivers, and not read errors as I
first suspected (although comment #7 raises questions).

The range of possible addresses for the ASB100 chip has been reduced to a single
address in 2.6.11, so the asb100 and w83l785ts drivers no longer have
overlapping addresses. This might fix this specific interaction problem, but I
still don't understand how exactly the drivers could interact with each other in
the first place, so the problem might still be present and could resurface at a
later time.

Nish, any particular reason why you were interested in this bug?
Comment 17 Nishanth Aravamudan 2005-03-03 09:29:27 UTC
Jean,

No particular reason beyond a strong desire to clean up old bugs as they come
up. Also, I don't mind helping resolve bugs when I *can* do something. Cleaning
out old bugs helps in this regard, too, as I can't really do anything for those
bugs which aren't really bugs.

In any case, would you rather I kept the bug open, Jean? Or John, would you be
willing to test patches Jean or I might be able to generate to determine the
root cause?

Thanks,
Nish
Comment 18 Nishanth Aravamudan 2005-03-03 09:30:49 UTC
John,

Ignore my previous comment to you, as you clearly stated you no longer have the
applicable mobo... Jean, that kind of makes me inclined towards closing the bug;
if someone else has the problem, I'd rather they open a new one so that it's
clear who is able to do the testing.

Thanks,
Nish
Comment 19 Jean Delvare 2005-03-03 10:22:24 UTC
Agreed, there's nothing more we can do at this point in time.
