Bug 197167 - tmp102: Initial read is invalid
Summary: tmp102: Initial read is invalid
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Hardware Monitoring
Hardware: All Linux
Importance: P1 normal
Assignee: Guenter Roeck
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-10-09 07:08 UTC by ralf.goebel
Modified: 2017-11-02 13:35 UTC

See Also:
Kernel Version: 4.9.28
Subsystem:
Regression: Yes
Bisected commit-id:


Description ralf.goebel 2017-10-09 07:08:52 UTC
I'm trying to switch from kernel 4.4.12 to 4.9.28 and am seeing an immediate shutdown because of overtemperature:

[    1.942609] i2c /dev entries driver
[    1.949093] thermal thermal_zone5: critical temperature reached(87 C),shutting down
[    1.956863] tmp102 0-0048: initialized
[    1.960667] reboot: Failed to start orderly shutdown: forcing the issue
[    1.967445] Emergency Sync complete
[    1.972549] reboot: Power down

This problem occurs only after a cold start of the system, not after a reboot.

I think the problem is caused by the initialization, when the device is switched from 12-bit to 13-bit mode. If the first read happens before the next conversion cycle completes, the temperature register still holds a 12-bit value. Converted to Celsius as if it were a 13-bit value, the result is twice the actual temperature.
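The doubling can be checked with a little arithmetic. The following is a minimal sketch, not driver code: the helper names `reg12_to_mC` and `reg13_to_mC` are hypothetical, and it assumes the TMP102 register layout of a left-justified value with 0.0625 C per LSB in both modes (bits 15..4 in 12-bit mode, bits 15..3 in 13-bit mode).

```c
#include <stdint.h>

/*
 * Hypothetical helper: 12-bit mode, value left-justified in bits 15..4,
 * 0.0625 C per LSB, so milli-Celsius = reg * 62.5 / 16 = reg * 1000 / 256.
 */
static int reg12_to_mC(int16_t reg)
{
	return (reg * 1000) / 256;
}

/*
 * Hypothetical helper: 13-bit mode, value left-justified in bits 15..3,
 * 0.0625 C per LSB, so milli-Celsius = reg * 62.5 / 8 = reg * 1000 / 128.
 * Bit 0 is not part of the temperature and is masked off.
 */
static int reg13_to_mC(int16_t reg)
{
	return ((reg & ~0x01) * 1000) / 128;
}
```

For example, a 12-bit reading of 43.5 C is stored as 0x2B80. `reg12_to_mC(0x2B80)` yields 43500, while `reg13_to_mC(0x2B80)` yields 87000. If the board was at roughly 43.5 C, that would be consistent with the 87 C reported in the log above.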

I worked around this problem by evaluating the LSB of the temperature register: a 1 indicates that the value has 13 bits. I use this bit to choose the right-shift for the value accordingly:

/* Bit 0 set: 13-bit value, shift by 7; bit 0 clear: 12-bit value, shift by 8. */
static inline int tmp102_reg_to_mC(s16 val)
{
	return ((val & ~0x01) * 1000) >> (8 - (val & 0x01));
}
Comment 1 Jean Delvare 2017-10-23 11:30:57 UTC
Commit 3d8f7a89a197 ("hwmon: (tmp102) Improve handling of initial read delay") would be my prime suspect. Guenter?
Comment 2 ralf.goebel 2017-10-23 12:24:09 UTC
My workaround using bit 0 of the temperature register doesn't seem to work. Bit 0 turns to 1 as soon as the 13-bit mode is activated, even if the contents are still 12 bits.
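To illustrate why the bit 0 check cannot help in this situation, here is a small sketch that re-creates the workaround in user space under a hypothetical name, `workaround_reg_to_mC`. The register value 0x2B81 is an invented example: stale 12-bit contents for 43.5 C (0x2B80) with bit 0 already set, as described in this comment.

```c
#include <stdint.h>

/*
 * User-space copy of the workaround (hypothetical name).
 * Bit 0 set selects the 13-bit scaling (shift by 7),
 * bit 0 clear selects the 12-bit scaling (shift by 8).
 * Only non-negative inputs are used here, so the right shift is well defined.
 */
static int workaround_reg_to_mC(int16_t val)
{
	return ((val & ~0x01) * 1000) >> (8 - (val & 0x01));
}
```

With bit 0 clear, `workaround_reg_to_mC(0x2B80)` gives 43500 as intended. With the stale contents and bit 0 already set, `workaround_reg_to_mC(0x2B81)` gives 87000, so the doubled reading is still reported.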
Comment 3 Guenter Roeck 2017-10-23 22:52:18 UTC
Yes, I guess it is 3d8f7a89a197. That commit sets the wait period only if the chip was shut down, but it appears that it is also needed if the resolution changes. I'll check the datasheet tonight to confirm and submit a patch.
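The idea of applying the wait period after any configuration change can be sketched as follows. This is a user-space illustration only, not the patch that was submitted: `struct sensor_state`, `sensor_config_written`, and `sensor_data_valid` are hypothetical names, timestamps are plain milliseconds rather than jiffies, and the conversion time constant is an assumed value that should be checked against the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed worst-case conversion time in milliseconds (illustrative value). */
#define CONVERSION_TIME_MS 35

/* Hypothetical per-device state: earliest time at which a reading is valid. */
struct sensor_state {
	uint64_t ready_time_ms;
};

/*
 * Call after every write that changes the configuration, including a
 * resolution change, not only after leaving shutdown mode.
 */
static void sensor_config_written(struct sensor_state *s, uint64_t now_ms)
{
	s->ready_time_ms = now_ms + CONVERSION_TIME_MS;
}

/* True once a full conversion in the new mode can have completed. */
static bool sensor_data_valid(const struct sensor_state *s, uint64_t now_ms)
{
	return now_ms >= s->ready_time_ms;
}
```

A read path built on this would return an error or wait while `sensor_data_valid()` is false, so a register value left over from the previous mode is never converted.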
Comment 4 Jean Delvare 2017-11-02 13:35:15 UTC
Fixed by

commit d0725439354a58f2b13b9f5234420641b662b9c4
Author: Guenter Roeck
Date:   Mon Oct 23 17:36:03 2017 -0700

    hwmon: (tmp102) Fix first temperature reading
