Bug 112811
Summary: | udevd hangs on boot for 40-50 seconds | ||
---|---|---|---|
Product: | Drivers | Reporter: | Stuart Foster (smf-linux) |
Component: | I2C | Assignee: | Jean Delvare (jdelvare) |
Status: | NEEDINFO --- | ||
Severity: | normal | CC: | elrond94, mirh, v10lator |
Priority: | P1 | ||
Hardware: | IA-32 | ||
OS: | Linux | ||
Kernel Version: | Linux 4.5-RC5 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
Bisect .config , Linux 4.5-RC5 dmesg and lspci -v
dmesg output i2c debug turned on dmesg from test i2c-piix4.ko dmesg with i2c-piix4 DEBUG turned on dmesg after soak Binary DMI dump from ASUS SABERTOOTH 990FX R2.0 Binary DMI dump from ASUS M5A97 PRO dmesg with i2c debug, ASUS K53U |
Description
Stuart Foster
2016-02-21 19:02:21 UTC
Created attachment 204361 [details]
dmesg output i2c debug turned on
Here with some additional information, also:
root@Andromeda:~# i2cdetect -l
i2c-0 i2c Radeon i2c bit bus 0x90 I2C adapter
i2c-1 i2c Radeon i2c bit bus 0x91 I2C adapter
i2c-2 i2c Radeon i2c bit bus 0x92 I2C adapter
i2c-3 i2c Radeon i2c bit bus 0x93 I2C adapter
i2c-4 i2c Radeon i2c bit bus 0x94 I2C adapter
i2c-5 i2c Radeon i2c bit bus 0x95 I2C adapter
i2c-6 i2c Radeon i2c bit bus 0x96 I2C adapter
i2c-7 i2c Radeon i2c bit bus 0x97 I2C adapter
i2c-8 smbus SMBus PIIX4 adapter port 0 at 0b00 SMBus adapter
i2c-9 smbus SMBus PIIX4 adapter port 2 at 0b00 SMBus adapter
i2c-10 smbus SMBus PIIX4 adapter port 3 at 0b00 SMBus adapter
i2c-11 smbus SMBus PIIX4 adapter port 4 at 0b00 SMBus adapter
i2c-12 smbus SMBus PIIX4 adapter port 1 at 0b20 SMBus adapter
If I build the i2c_piix4 code into the kernel the boot hangs before udevd is run, which in my opinion eliminates udevd as the source of my problem and further points the problem back to the i2c piix4 fix identified previously.
Thanks.
Thanks for reporting. Which chip is this happening on? $ /sbin/lspci -nn | grep SMBus should tell. Sorry, you can ignore the previous comment, just found the lspci in the first attachment. The bisected patch does the right thing, the problem is that it uncovers a problem that was hidden by the bug it fixes. Looking at the logs, it turns out that 2 of the multiplexed SMBus channels do not work properly. You have an i2c device driver loaded (eeprom) which probes 8 I2C addresses on each SMBus port. There is a 2-3 second timeout happening each time the SMBus controller doesn't respond. 2.5 * 2 * 8 = 40 seconds. As a temporary workaround, if you do not load the eeprom module, you should avoid the boot time delay. I'll see if I can find a way to find our which SMBus channels are disabled, so that they aren't registered at all. As a side note, I see you have CONFIG_GPIO_SX150X=y in your configuration file, which you most probably do not need. Thanks for the update, I have taken note of the CONFIG_GPIO_SX150X=y comment and the workaround for the eeprom module should I need it. I am still trying to gain access to a system with a similar SMBus controller. I have modified the driver to check for pin usage. If you feel adventurous, feel tree to give it a try: http://jdelvare.nerim.net/devel/lm-sensors/drivers/i2c-piix4/ Instructions at: http://jdelvare.nerim.net/devel/lm-sensors/drivers/INSTALL Remember it is totally untested code (although I don't expect anything serious to happen besides the driver not working.) This may or may not solve your problem. In both cases I would be interested in the corresponding kernel logs. It is also possible that the pins are configured for SMBus usage (as this is the default for all but SMBus port 4) but the hardware manufacturer left the pins floating because they don't need the extra SMBus ports. If this is the case, my changes won't help, and I'm afraid I don't have any idea how to detect that :( Created attachment 206021 [details]
dmesg from test i2c-piix4.ko
Using the test i2c-piix4.ko resulted in the same level of delay in system start up.
If we cannot detect the ports that are causing the problem at run time could we set boot parameters to ignore specific ports ? :-) My changes didn't work: piix4_smbus 0000:00:14.0: ACPI MMIO space enabled, memory-mapped at 0x00000000 Obviously this address makes no sense. So either my code is wrong or the address is not initialized by your BIOS. But it is odd because the datasheets says it does have a (non-zero) default value. I just noticed that the sp5100_tco driver must deal with the same kind of configuration check for this chipset and the code there is much more complex than what I came up with. I'll investigate. With the current code, as it can't read the pin configuration, the driver falls back to the default pin configuration, which is ports 0, 2, 3 enabled and port 4 disabled. You can indeed see that port 4 is no longer instantiated by the driver, which is probably good as it didn't look like you had anything there. I'm not 100% sure though, as apparently the many errors caused the SMBus controller to go wild and I'm not sure it is still behaving properly. Out of curiosity, are you still able to access the devices on port 0 (e.g. eeprom) after the driver is done with probing the phantom ports? I'm not sure the error recovery paths are working, they are barely tested. Anyway, port 4 is not where the timeouts causing the boot delay were occurring anyway. The delays come from ports 2 and 3 which are still enabled, so no improvement. Indeed if we can't get a reliable way to detect which ports should be enabled or disabled, we may have to disable multiplexing altogether when we have no reason to believe is is configured properly. Or at least disable I2C/SMBus slave device auto-detection by default on the extra ports. And we could add a module parameter to manually override the configuration, as you suggested. But I would prefer if we can get things to just work by default... So I'll try harder. Note: The new log is less verbose because debugging isn't enabled. If you uncomment the following line in the standalone driver's Makefile: #EXTRA_CFLAGS += -DDEBUG you'll get more information from the failures. From the Bolton RRG datasheet: SCL0/GPIO43 is configured as SCL0 when (PMIO_2E) Smbus0Sel[1:0] are set to 00b; otherwise, it is configured as GPIO43 SDA0/GPIO47 is configured as SDA0 when (PMIO_2E) Smbus0Sel[1:0] are set to 00b; otherwise, it is configured as GPIO47 And same for all other SMBus ports. So at least on the Bolton there is no way to detect which pins were intended for SMBus usage and which weren't. Don't know if that holds for the SB800 as well, or Mullins/Carrizo either. Created attachment 206631 [details]
dmesg with i2c-piix4 DEBUG turned on
This is the dmesg with i2c-piix4 DEBUG turned on (using linux-4.5-rc6). This build does in fact boot quicker (but if we are dealing with floating ports who knows what the current states are). The machine has been turned off till just now so I will let it soak for a while and test again. I did check the eeprom contents via the /sys entries and they do appear to be working on this build.
Good, I w Good, I was worried that maybe the repeated failures would confuse the controller to a point where it would no longer work even on the buses which actually exist. Created attachment 206641 [details]
dmesg after soak
Retesting after ~1 hour shows the boot delay is back to ~50 seconds. Looks like floating ports are present ?
Jean In response to your e-mail: "Hi Stuart, I am travelling and unable to update my standalone driver repository at this time. Could you please give a try to the attached driver and report the address reported for ACPI MMIO space, hopefully no longer 0x00000000? I stole the code from the sp5100_tco driver. Thanks," I tested the revised 2c-piix4.c code and the reseult was as follows: [ 5.254583] piix4_smbus 0000:00:14.0: ACPI MMIO space enabled, memory-mapped at 0xfed80000 Good, at least reading the address is fixed. Did it change anything regarding which ports are instantiated? Do you still get ports 0, 2, 3 and 1 instantiated and port 4 skipped? Yes: i2cdetect -l i2c-0 i2c Radeon i2c bit bus 0x90 I2C adapter i2c-1 i2c Radeon i2c bit bus 0x91 I2C adapter i2c-2 i2c Radeon i2c bit bus 0x92 I2C adapter i2c-3 i2c Radeon i2c bit bus 0x93 I2C adapter i2c-4 i2c Radeon i2c bit bus 0x94 I2C adapter i2c-5 i2c Radeon i2c bit bus 0x95 I2C adapter i2c-6 i2c Radeon i2c bit bus 0x96 I2C adapter i2c-7 i2c Radeon i2c bit bus 0x97 I2C adapter i2c-8 smbus SMBus PIIX4 adapter port 0 at 0b00 SMBus adapter i2c-9 smbus SMBus PIIX4 adapter port 2 at 0b00 SMBus adapter i2c-10 smbus SMBus PIIX4 adapter port 3 at 0b00 SMBus adapter i2c-11 smbus SMBus PIIX4 adapter port 1 at 0b20 SMBus adapter Jean, It's been a while, have you had any further thoughts on this issue ? Thanks Stuart *** Bug 118811 has been marked as a duplicate of this bug. *** Created attachment 221661 [details] Binary DMI dump from ASUS SABERTOOTH 990FX R2.0 (In reply to Jean Delvare from bug 118811 comment #19) > The current list of issues with my views on them was published on the > linux-i2c list some times ago: > http://marc.info/?l=linux-i2c&m=145744643613736&w=2 "Yet another possibility is a DMI-based list of boards with known mappings. Not sure what is best, maybe we need a mix of all that, opinions welcome." I guess a list could work. "please provide the DMI data for your board, in case we need it later." Attached is the output of dmidecode --dump-bin - hope it helps. Created attachment 221691 [details]
Binary DMI dump from ASUS M5A97 PRO
If it will help herewith the DMI dump for my M5A97 PRO.
Is there any further work being carried out to resolve this issue ? Think the bug I have is the same as you: https://bugzilla.kernel.org/show_bug.cgi?id=192121 I used the work around in comment 4 to solve my startup delay, your issue may be different: "As a temporary workaround, if you do not load the eeprom module, you should avoid the boot time delay." *** Bug 200713 has been marked as a duplicate of this bug. *** Created attachment 278001 [details]
dmesg with i2c debug, ASUS K53U
And here's mine with every I2C_DEBUG thing enabled.
It's already a miracle I could boot to get it, after plenty of restarts.
I tested 4.19.133 (latest attempt had been with 4.19.0) and everything seems alright. In fact piix4 mentions don't even pop up anymore. After quite a lot of time spent bisecting (put aside work happened to port selection logic in the i2c-piix4 driver too) the commits that really made a difference were the backports of 4abb951b73ff0a8a979113ef185651aa3c8da19b and 50ee7529ec4500c88f8664560770a7a1b65db72b. |