Bug 13008
Summary: | Serial not working after resume | ||
---|---|---|---|
Product: | Drivers | Reporter: | Bruno Prémont (bonbons) |
Component: | Serial | Assignee: | Alan (alan) |
Status: | RESOLVED OBSOLETE | ||
Severity: | normal | CC: | alan, rjw, rui.zhang |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.28 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 7216, 56331 | ||
Attachments: |
DSDT as in /proc/acpi/dsdt
Kernel log (boot + suspend & resume) Test program to look at registers Basic platform driver/device for F81216 Platform driver/device for f81216, supports serial port suspend/resume Updated patch Updated patch |
Description
Bruno Prémont
2009-04-04 12:17:08 UTC
Created attachment 20799 [details]
Kernel log (boot + suspend & resume)
Note: kernel has vserver patch and backported squashfs-4 patches from 2.6.29 applied. Also included is "Enable PNPACPI _PSx Support, v3" patch mentionned above.
At next opportunity I will check with vanilla 2.6.29 as well as comparing results for ttyS[01] versus ttyS[23]
If there are any debug options to enable or other means of getting useful additional information, please tell me so I can get more out of the 2.6.29 attempts.
Checking on 2.6.29.1 vanilla I get the same results, after first suspend/resume nobody answers anymore on ttyS2. Connecting the UPS to ttyS0 nut can again contact UPS even after additional suspend/resume cycles. So it's only the two extra serial ports that get unusable after suspend. According to mainboard user manual the 2 extra serial ports are driven by a Fintek F81216D chip (Google did find this spec for the chip: http://www.fintek.com.tw/files/productfiles/F81216_V032P.pdf ) Very hard to tell. If the PC is behaving in the usual fashion I'd expect the ports to just work. The fallback case is that ACPI is managing the ports on an ACPI 3 box in which case I'd expect the patch you tried to sort it out. I suppose it is possible that this system relies on some sort of magic windows driver to restore those ports to life, in which case someone will need to write an equivalent Linux driver. How can I compare the chip's configuration before suspend and after resume? I guess isadump should be the right tool though I don't know at what address(es) to direct it to (how can I determine the right address(es)?) [ 0.840514] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A [ 0.840707] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A [ 0.840897] serial8250: ttyS2 at I/O 0x3e8 (irq = 4) is a 16550A [ 0.841092] serial8250: ttyS3 at I/O 0x2e8 (irq = 3) is a 16550A [ 0.841480] 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A [ 0.841752] 00:08: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A [ 0.842031] 00:0a: ttyS2 at I/O 0x3e8 (irq = 11) is a 16550A [ 0.842301] 00:0b: ttyS3 at I/O 0x2e8 (irq = 5) is a 16550A # cat /sys/devices/pnp0/00:0a/resources state = active io 0x3e8-0x3ef irq 11 # cat /sys/devices/pnp0/00:0b/resources state = active io 0x2e8-0x2ef irq 5 I'd start by reading all the config registers in the chip manual and doing the same after suspend/resume. If its not being restored by the BIOS/ACPI you'll need to write some driver code to do the restore That's pretty much what I would like to do though I don't know what address to feed to isadump, nor do I know how to determine it :( If you have a good pointer on how to determine the right address I would appreciate a lot! (as that would also help me to get started with writing GPIO driver for super-io chips on 2 of my systems, especially the one I don't know which pins are used for the GPIO) From the datasheet it looks like the device sits on the LPC bus (thats the low pincount bus that is used for extra oddments like serial/parallel on motherboards) From the bottom of page 16 its control registers at 0x4E/0x4F so inb/outb to 0x4E/0x4F will hit the super I/O chip controls. The smple code on page 21 is I think what you need - except to note that our outb() and their outportb have the arguments backward. Another useful trick is to do this as root from userspace and use ioperm() (see man ioperm) to allow port 0x4e/0x4f access to your process - you can then write test programs in user space and play with those registers. Created attachment 21334 [details]
Test program to look at registers
Using attached program (compiled with gcc -Wall -O2 -o iodump iodump.c) I get the following output:
===== program output =====
Calling ioperm(4e, 2, 1): 0 (Success)
Activating configuration mode.
Global registers:
Software Reset (0x02): 0xff
Logic Device Select (0x07): 0xff
Device ID (0x20, 0x21): 0xffff
Vendor ID (0x23, 0x24): 0xffff
Clock Source Select (0x25): 0xff
Test mode (0x2f): 0xff
UART 1 registers:
Logic Device Select (0x07): 0xff
Device Enable (0x30): 0xff
I/O Port Select (0x60, 0x61): 0xffff
IRQ Channel Select (0x70): 0xff
Clock Select (0xf0): 0xff
IR1 Control (0xf1): 0xff
UART 2 registers:
Logic Device Select (0x07): 0xff
Device Enable (0x30): 0xff
I/O Port Select (0x60, 0x61): 0xffff
IRQ Channel Select (0x70): 0xff
Clock Select (0xf0): 0xff
UART 3 registers:
Logic Device Select (0x07): 0xff
Device Enable (0x30): 0xff
I/O Port Select (0x60, 0x61): 0xffff
IRQ Channel Select (0x70): 0xff
Clock Select (0xf0): 0xff
UART 4 registers:
Logic Device Select (0x07): 0xff
Device Enable (0x30): 0xff
I/O Port Select (0x60, 0x61): 0xffff
IRQ Channel Select (0x70): 0xff
Clock Select (0xf0): 0xff
Leaving configuration mode.
===== end of program output =====
I wonder why I always just get to see ff as a result from inb ... according to Fintek documentation at least a few of the global registers should return fixed values (which are different from 0xff!, e.g. for device and vendor IDs)
Any idea about what I might be doing wrong? (I tried with and without the code block to select configuration mode, also trying the 4 possible values to be used to enter configuration mode).
The chip pins document says for pin 41 that it picks between 0x4E and 0x2E. Did you try 0x2E yet ? Otherwise it all looks complete and correct Ok, got some good output using 0x2e and 0x2f when also going through the configuration mode part (if not doing so output is just ff): Calling ioperm(2e, 2, 1): 0 (Success) Activating configuration mode. Global registers: Software Reset (0x02): 0x0 Logic Device Select (0x07): 0x1 Device ID (0x20, 0x21): 0x28 Vendor ID (0x23, 0x24): 0x1934 Clock Source Select (0x25): 0x1 Test mode (0x2f): 0x0 UART 1 registers: Logic Device Select (0x07): 0x0 Device Enable (0x30): 0x1 I/O Port Select (0x60, 0x61): 0x3e8 IRQ Channel Select (0x70): 0x3b Clock Select (0xf0): 0x0 IR1 Control (0xf1): 0x44 UART 2 registers: Logic Device Select (0x07): 0x1 Device Enable (0x30): 0x1 I/O Port Select (0x60, 0x61): 0x2e8 IRQ Channel Select (0x70): 0x35 Clock Select (0xf0): 0x0 UART 3 registers: Logic Device Select (0x07): 0x2 Device Enable (0x30): 0x0 I/O Port Select (0x60, 0x61): 0x00 IRQ Channel Select (0x70): 0x0 Clock Select (0xf0): 0x0 UART 4 registers: Logic Device Select (0x07): 0x3 Device Enable (0x30): 0x0 I/O Port Select (0x60, 0x61): 0x00 IRQ Channel Select (0x70): 0x0 Clock Select (0xf0): 0x0 Leaving configuration mode. As I just have remote access to the system right now I can't do the suspend check right now though I will in a few hours and report back. In case a quirk is needed to resume the chip, do you have a suggestion as to where to put it? (probably to run on DMI-match, saving settings on suspend and restoring them on resume) In case there is useful information in /proc/ioport, here it is (at least the 0x4e,0x4f and 0x2e,0x2f are not listed anywhere): 0000-001f : dma1 0020-0021 : pic1 0040-0043 : timer0 0050-0053 : timer1 0060-0060 : keyboard 0064-0064 : keyboard 0070-0071 : rtc0 0080-008f : dma page reg 00a0-00a1 : pic2 00c0-00df : dma2 00f0-00ff : fpu 0170-0177 : 0000:00:14.1 0170-0177 : pata_atiixp 01f0-01f7 : 0000:00:14.1 01f0-01f7 : pata_atiixp 0201-0201 : IT8712F Watchdog 0280-028f : pnp 00:09 02e8-02ef : serial 02f8-02ff : serial 0376-0376 : 0000:00:14.1 0376-0376 : pata_atiixp 03c0-03df : vga+ 03e8-03ef : serial 03f6-03f6 : 0000:00:14.1 03f6-03f6 : pata_atiixp 03f8-03ff : serial 040b-040b : pnp 00:0e 04d0-04d1 : pnp 00:0e 04d6-04d6 : pnp 00:0e 0800-089f : pnp 00:0e 0800-0803 : ACPI PM1a_EVT_BLK 0804-0805 : ACPI PM1a_CNT_BLK 0808-080b : ACPI PM_TMR 0810-0815 : ACPI CPU throttle 0820-0827 : ACPI GPE0_BLK 08ff-08ff : ACPI PM2_CNT_BLK 0900-090f : pnp 00:0e 0910-091f : pnp 00:0e 0a30-0a3f : pnp 00:09 0a60-0a6f : pnp 00:0c 0b00-0b0f : 0000:00:14.0 0b10-0b1f : pnp 00:0e 0c00-0c01 : pnp 00:0e 0c14-0c14 : pnp 00:0e 0c50-0c51 : pnp 00:0e 0c52-0c52 : pnp 00:0e 0c6c-0c6c : pnp 00:0e 0c6f-0c6f : pnp 00:0e 0cd0-0cd1 : pnp 00:0e 0cd2-0cd3 : pnp 00:0e 0cd4-0cd5 : pnp 00:0e 0cd6-0cd7 : pnp 00:0e 0cd8-0cdf : pnp 00:0e 0cf8-0cff : PCI conf1 0e00-0e0f : pnp 00:09 0e80-0e8f : pnp 00:09 0e85-0e86 : it87 0e85-0e86 : it87 8000-800f : 0000:00:12.0 8000-800f : ahci 9000-9003 : 0000:00:12.0 9000-9003 : ahci a000-a007 : 0000:00:12.0 a000-a007 : ahci b000-b003 : 0000:00:12.0 b000-b003 : ahci c000-c007 : 0000:00:12.0 c000-c007 : ahci d000-dfff : PCI Bus 0000:01 d000-d0ff : 0000:01:05.0 e000-efff : PCI Bus 0000:04 fe00-fefe : pnp 00:0e ff00-ff0f : 0000:00:14.1 ff00-ff0f : pata_atiixp And here is the result after a S3 suspend/resume cycle: Calling ioperm(2e, 2, 1): 0 (Success) Activating configuration mode. Global registers: Software Reset (0x02): 0x0 Logic Device Select (0x07): 0x1 Device ID (0x20, 0x21): 0x28 Vendor ID (0x23, 0x24): 0x1934 Clock Source Select (0x25): 0x0 Test mode (0x2f): 0x0 UART 1 registers: Logic Device Select (0x07): 0x0 Device Enable (0x30): 0x1 I/O Port Select (0x60, 0x61): 0x3e8 IRQ Channel Select (0x70): 0x1b Clock Select (0xf0): 0x0 IR1 Control (0xf1): 0x44 UART 2 registers: Logic Device Select (0x07): 0x1 Device Enable (0x30): 0x1 I/O Port Select (0x60, 0x61): 0x2e8 IRQ Channel Select (0x70): 0x15 Clock Select (0xf0): 0x0 UART 3 registers: Logic Device Select (0x07): 0x2 Device Enable (0x30): 0x0 I/O Port Select (0x60, 0x61): 0x00 IRQ Channel Select (0x70): 0x0 Clock Select (0xf0): 0x0 UART 4 registers: Logic Device Select (0x07): 0x3 Device Enable (0x30): 0x0 I/O Port Select (0x60, 0x61): 0x00 IRQ Channel Select (0x70): 0x0 Clock Select (0xf0): 0x0 Leaving configuration mode. The differences towards post-boot state are: - global clock-source select (0x00 instead of 0x01) - UART1, IRQ channel 0x1b instead of 0x3b - UART2, IRQ channel 0x15 instead of 0x35 Setting all of the 3 to their original values gets the serial ports back alive for communication. I will put a patch together on Wednesday or Thursday to get this done by kernel during resume, though any hint as to where to put the quirk would be appreciated! Alan, thanks a lot for your guidance The proper way to do it is probably to register a platform device for the hardware and create a new 8250_xxx module for it. That avoids getting code into the system nobody else needs and means you can register and handle the ports nicely too. Your platform driver can register for suspend/resume events to reprogram the ports, and then suspend/resume the attached ports so it gets the ordering right. Created attachment 21469 [details]
Basic platform driver/device for F81216
Hi Alan,
This is what I got together for now (neither compile nor runtime tested yet, for early feed-back).
I'm having a hard time understanding how the serial port registration happens, especially as ports seem to get registered multiple times (first by 8250.c as ISA device; then by 8250_pnp.c).
How shall I register the ports as dependent on my platform driver/device so the ordering gets right for suspend/resume and it does not result in an interference with the two code paths already registering the ports?
Except blind probing I see no way to detect the presence of the chip; on my system DMI is of no help at all (all useful fields contain "To Be Filled By O.E.M.")...
The port discovery part is done by tables in a platform device. I wouldn't worry about that - if you can get the basic driver you've posted to save/resume the port data and test it then I can wrap the serial driver code around it for you to test further. A lot of systems (particularly the ones we end up working around ;)) have things like OEM in the BIOS. Have a look at the lspci -vvxxx and you should see unique subvendor/subdevice identifiers from other parts of the system that are on the mainboard. If so you can do a PCI match with those instead. I recently got a ChangeLog of newer BIOS revisions for my mainboard and they list the failing serial ports in there. So I updated the BIOS but new revision breaks resume from S3 (confirmed with 2.6.28.9, 2.6.29.4 and 2.6.30-rc7). As such I don't know (yet) is the BIOS update really fixes the serial port configuration. Changelog extract: Editor : Clare Version : Release BIOS Ver V1.7 Date : 2008/01/03 BIOS name : E093MR17.ROM CheckSum : 0A399h Action : 1 . Solve DQV issue 1: Sec IO S3 wake-up COM port lose (Root cause : Core issue) 2: Sytem temperature issue (Root cause : Neglect By BIOS engineer) 2 . Add Keyboard / Mouse (S1/S3) wake-up function 3 . Add RS232/RS422/RS485 BIOS item 4 . Change VGA BIOS solve DVI and CRT detect fail issue So I will probably let this bug languish for a week in the hope of getting S3 working again. If I switch back to the older BIOS (or the newest does not correct the serial ports) I will complete this patch. Created attachment 21784 [details] Platform driver/device for f81216, supports serial port suspend/resume With this instance of the driver (lots of typos fixed since attachment #21469 [details]) I can get the serial ports properly reconfigured during resume. This version of the driver includes debugging output in suspend(), suspend_late(), resume_early() in order to find out what state the chip is in at the various resume periods. From the state dumping at the various suspend/resume I understand that the ACPIPNP resume code is mis-configuring the f81216 when re-enabling the ports. On suspend the 2 UART ports just get disabled, on resume they get re-enabled but their irqchannel register gets changed as well. Working value: 0x3b [00111011] PNP resume value: 0x1b [00011011] From the spec this means: bit 5 = URAIRQ_MODE 0 = PCI IRQ sharing mode 1 = ISA IRQ sharing mode bit 4 = URAIRQ_SHAR 0 = IRQ is not sharing with other deivce 1 = IRQ is sharing with other deivce So after resume IRQ sharing mode is switched from ISA sharing to PCI sharing... This meanse that this code should override/replace the suspend/resume of the matching PNP devices so the end-result is correct. Extract of generated log: ... f81216 f81216: preparing suspend f81216 f81216: suspend f81216 f81216: Chip status: clk=1 f81216 f81216: UART 0 status: ioport=3e8, irqchan=3b, clk=0, enable=1 f81216 f81216: UART 1 status: ioport=2e8, irqchan=3b, clk=0, enable=1 f81216 f81216: UART 2 status: ioport=0, irqchan=0, clk=0, enable=0 f81216 f81216: UART 3 status: ioport=0, irqchan=0, clk=0, enable=0 ... serial 00:0b: legacy suspend serial 00:0b: disabled serial 00:0a: legacy suspend serial 00:0a: disabled system 00:09: legacy suspend serial 00:08: legacy suspend serial 00:08: disabled serial 00:07: legacy suspend serial 00:07: disabled ... f81216 f81216: LATE suspend f81216 f81216: Chip status: clk=1 f81216 f81216: UART 0 status: ioport=3e8, irqchan=3b, clk=0, enable=0 f81216 f81216: UART 1 status: ioport=2e8, irqchan=3b, clk=0, enable=0 f81216 f81216: UART 2 status: ioport=0, irqchan=0, clk=0, enable=0 f81216 f81216: UART 3 status: ioport=0, irqchan=0, clk=0, enable=0 ... f81216 f81216: EARLY resume f81216 f81216: Chip status: clk=1 f81216 f81216: UART 0 status: ioport=3e8, irqchan=3b, clk=0, enable=0 f81216 f81216: UART 1 status: ioport=2e8, irqchan=3b, clk=0, enable=0 f81216 f81216: UART 2 status: ioport=0, irqchan=0, clk=0, enable=0 f81216 f81216: UART 3 status: ioport=0, irqchan=0, clk=0, enable=0 ... serial 00:07: legacy resume serial 00:07: activated serial 00:08: legacy resume serial 00:08: activated system 00:09: legacy resume serial 00:0a: legacy resume serial 00:0a: activated serial 00:0b: legacy resume serial 00:0b: activated ... f81216 f81216: resume f81216 f81216: restoring device 0 config at offset 70 (was 1b, writing 3b) f81216 f81216: restoring device 1 config at offset 70 (was 1b, writing 3b) f81216 f81216: completing resume ... A few notes related to the BIOS versions I checked: - V1.4 (original BIOS when I got the board) I have no tested this driver under that BIOS but it's the BIOS I used to do the initial userspace debugging - V1.8 (partial improvement over V1.4 for serial, e.g. chip clock setting remains correct) This BIOS also works properly in either S1 or S3 and I keep the serial ports working with my driver - V2.1 (no visible difference on serial side against V1.8, but BIOS expects different OS behavior for S3, e.g. it does cold-reboot during resume; S1 works as for V1.8) Ok made a couple of minor tweaks (just to add resource checking) Can you give this a test and if its ok a Signed-off-by: and I'll kick it upstream. Thanks a lot for all the work on this Created attachment 21981 [details]
Updated patch
> platform_device_add_resources(f81216_device, &res, 1);
Does this have any "reservation"/exclusivity impact on that resource?
If so this will clash with hwmon/watchdog on the ITE SuperIO chip on my mainboard.
Both SuperIO chips do share the same base address for configuration (though they do have different magic sequence to enter/leave configuration mode)...
By default both of them are just listening for their magic activation sequence on the same IO port. This makes it especially important to call the exit sequence when done in order to not interfere with a driver for the other chip.
If platform_device_add_resources() grants exclusive access to the resource, then that part will have to be removed...
The suspend and resume methods should probably also call the request_region()/release_region() functions in order to prevent possible clashes with ITE drivers. (I did not check if those do use request_region() - and for now at least hwmon does no suspend/resume, but that might change in the future)
For inclusion it's probably not worth keeping the dev_printk(KERN_DEBUG ...) at all suspend/resume steps (eventually have them tied to a debug module param or some compile-time debug flag).
Those in restore_*() functions are useful as they tell what has been fixed/changed similar to what is done for PCI.
Test results will follow
Tested with ITE87 hwmon and watchdog built in, 8250_f81216 modular.
Works fine either way:
- patch as in attachment #21981 [details]
- modified version of patch, surrounding all superio_enter()/superio_exit()
with request_region()/release_region() and commenting out the
platform_device_add_resources() call.
It might be worth providing configuration of default io_base and config_key at kernel configuration time (useful for built-in case as it avoids need for extra cmdline arguments) If this is ok I will update patch accordingly.
Created attachment 22029 [details]
Updated patch
I gave the patch an additional update:
- added kconfig entries for both module params so the module can be configured
for the system at build time (not requiring to carry a cmdline for built-in
case)
- surrounded superio_enter/supoerio_exit calls in suspend/resume functions with
request_region/release_region
- added signed-off-by
- optional code paths (e.g. just for debugging/monitoring the chip's state at
the various suspend/resume steps) are now surrounded with #ifdef DEBUG
and can be dropped for upstreaming (I don't think it's worth showing during
normal suspend/resume)
I've tested the patch as is for built-in case and tested the one without #ifdef DEBUG in modular case and both work for me.
The only part I'm not sure would work properly is if 8250_pnp is modular, the required ordering between 8250_pnp and this drivers suspend/resume calls may end up being wrong... (this code must run after 8250_pnp for resume, and before it on suspend)
I have not tested this worry-case yet though.
The right way to handle the arguments isn't Kconfig but to do DMI or PCI matches for the machines with the problem. At that point it can be automated and autoloaded. Other bits look sensible. I've in the meantime posted a draft patch to add proper support for shared resources like SuperIO chips so you can wait for the device (basically a request_region that waits if the other holder of the region is also temporary) Otherwise looks good but as you say eventually need wiriing into the 8250 driver. cc dmitry Alan & Bruno, what's the status of this bug? Any plans of pushing the patch upstream? Well, before pushing upstream the patch shall be adapted so it operates with the right synchronization with regard to 8250_pnp (as it's 8250_pnp's resume() which breaks the serial ports) so the fixup must happen after 8250_pnp's resume() has run. Currently the order is just a matter of "luck" as in order of registration (works with both built-in). A quirk to disable 8250_pnp-firmware resume for those two serial ports would probably even be sufficient as the f81216's function configuration is correct before the 8250_pnp's resume() is run (don't know about higher-level UART state). For the machine matching it will be hard (the older BIOSes seem to export no specific identification information at all (DMI is all OEM, PCI has no board-specific sub-IDs), the more recent ones reboot during resume / at least that was the state last time I checked) (In reply to comment #24) > cc dmitry Not sure why, I don't work on serial... re-assign to drivers/serial category. |