Bug 196509

Summary: iTCO_wdt regression: reboots before the timeout expires
Product: Drivers Reporter: Seb Lu (seblu)
Component: Watchdog Assignee: drivers_watchdog (drivers_watchdog)
Status: NEW ---    
Severity: normal CC: bonzini, grawity, jb, mwilck, thomas, wim
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.12.x Tree: Mainline
Regression: No

Description Seb Lu 2017-07-27 21:37:48 UTC
Since kernel 4.12.0, the iTCO_wdt watchdog driver reboots too soon on my Intel hardware (e.g. Dell M610, ThinkPad X1 Carbon, ASUSTeK H81I-PLUS/i3-4160).

With 4.11.0, my watchdog timeout was set to 30s. About halfway (15s), the watchdog daemon (systemd here) reset the timeout.

With 4.12.[0123], when you set it to 30s, it reboots at 15s. Snap.
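
To make the timing explicit, here is a minimal sketch of the arithmetic described above. The reboot deadlines are the ones reported in this description (30s on 4.11, 15s on 4.12), not values read from hardware. The helper name `survives` and the choice to treat a ping that lands exactly on the deadline as too late are assumptions made for illustration.

```python
def survives(ping_at_s: float, reboot_at_s: float) -> bool:
    """Return True if the daemon's ping lands strictly before the reboot deadline."""
    return ping_at_s < reboot_at_s


configured_timeout_s = 30.0
# The daemon (systemd here) pings about halfway through the configured timeout.
ping_at_s = configured_timeout_s / 2

# Reboot deadlines as observed in this report, keyed by kernel series.
reboot_at_s = {"4.11": 30.0, "4.12": 15.0}

results = {
    kernel: survives(ping_at_s, deadline)
    for kernel, deadline in reboot_at_s.items()
}
print(results)
```

On 4.11 the ping at 15s arrives well before the 30s deadline. On 4.12 the ping and the observed reboot fall at the same moment, so the daemon has no margin left.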

Rolling back to 4.11 fixes the issue. Setting the timeout to a higher value, such as 60s, also works around it.
Comment 1 Thomas Bächler 2017-07-28 23:16:11 UTC
I can confirm this problem on a Lenovo T440s laptop and a desktop with an Asus H87-Plus mainboard.

Commit 1fccb73011ea8a5fa0c6d357c33fa29c695139ea seems like an obvious suspect.
Comment 2 Jörg Bornemann 2017-08-13 18:22:41 UTC
I can reproduce exactly this issue on an ASUS UX31A with kernel version 4.12.3 after upgrading from 4.11.9.

The described work-around works for me as well.
Comment 3 Wim Van Sebroeck 2017-10-06 08:36:20 UTC
Bug confirmed. I reverted patch 1fccb73011ea8a5fa0c6d357c33fa29c695139ea.
Comment 4 Martin Wilck 2017-10-06 20:28:30 UTC
Any chance that this fix could be submitted to stable/linux-4.12.y and stable/linux-4.13.y?
Comment 5 Paolo Bonzini 2017-10-12 13:46:05 UTC
Could you please test if the turn_SMI_watchdog_clear_off module option works for you (e.g. turn_SMI_watchdog_clear_off=99)?
Comment 6 Martin Wilck 2017-10-23 12:44:18 UTC
(In reply to Paolo Bonzini from comment #5)

Sorry for the late reply. No, turn_SMI_watchdog_clear_off=99 does not fix the issue on my Dell Latitude E7470.

A kernel with the patch from this bug applied reports:

[   13.593418] iTCO_vendor_support: vendor-support=0
[   13.594447] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
[   13.594577] iTCO_wdt: Found a Intel PCH TCO device (Version=4, TCOBASE=0x0400)
[   13.596893] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
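
For anyone comparing such logs across machines, the following is a rough sketch that pulls the TCO version, TCOBASE, heartbeat and nowayout values out of the lines above. The function name `parse_itco_log` and the regular expressions are assumptions based only on the message format shown here; other driver versions may word these messages differently.

```python
import re

# The log lines quoted above, copied verbatim.
LOG = """\
[   13.593418] iTCO_vendor_support: vendor-support=0
[   13.594447] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
[   13.594577] iTCO_wdt: Found a Intel PCH TCO device (Version=4, TCOBASE=0x0400)
[   13.596893] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
"""


def parse_itco_log(text: str) -> dict:
    """Extract the fields of interest; missing lines simply yield no keys."""
    info = {}
    found = re.search(r"Version=(\d+), TCOBASE=(0x[0-9a-fA-F]+)", text)
    if found:
        info["tco_version"] = int(found.group(1))
        info["tcobase"] = int(found.group(2), 16)
    init = re.search(r"heartbeat=(\d+) sec \(nowayout=(\d+)\)", text)
    if init:
        info["heartbeat_s"] = int(init.group(1))
        info["nowayout"] = int(init.group(2))
    return info


info = parse_itco_log(LOG)
print(info)
```

Here the driver reports a version 4 TCO device with a 30 second heartbeat, which matches the timeout discussed in this bug.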
Comment 7 Martin Wilck 2017-10-23 13:00:02 UTC
FTR, applied in stable/linux-4.13.y:

ff04be02de1b watchdog: Revert "iTCO_wdt: all versions count down twice"

Not in stable/linux-4.12.y yet.