Bug 1188
Summary: | 3com NIC fails to initialise unless acpi=off - T21, T22 | | |
---|---|---|---|
Product: | Drivers | Reporter: | Yaroslav Rastrigin (yarick) |
Component: | PCI | Assignee: | Luming Yu (luming.yu) |
Status: | RESOLVED CODE_FIX | | |
Severity: | normal | CC: | acpi-bugzilla, bugzilla.kernel.org, gj, Jon.Kibler, kernel.bugzilla-eran, patl, ralf, stefan |
Priority: | P2 | | |
Hardware: | i386 | | |
OS: | Linux | | |
Kernel Version: | 2.4.19 - .22, 2.5.65 - 2.6.0-test4 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments:
dmesg output
Try to verify if PCI config space could be messed up.
patch to collect operations to pci config space using configuration type 1
/proc/ioports from 2.6.6-mm3 on ThinkPad T21
acpidmp >/tmp/acpidmp.output for the linux-2.6.6-mm3 on a ThinkPad T21
iomem of Thinkpad T20 2.6.6
lspci of Thinkpad T20 2.6.6
dmsg of Thinkpad T20 2.6.6
syslog of Thinkpad T20 2.6.6
Log summary Thinkpad T20 2.6.6
dmesg from IBM A21P, driver failure
dmesg from IBM A21P, drivers works ok with acpi=off
Description
Yaroslav Rastrigin
2003-09-05 11:36:06 UTC
Created attachment 823 [details]
dmesg output
Near the tail of this dmesg you can see that the first attempt at
modprobe 3c59x
fails. The following one-liner is then executed:
setpci -v -H 1 -s 00:03.00 COMMAND=0x07 CACHE_LINE_SIZE=0 LATENCY_TIMER=0x40
BASE_ADDRESS_0=0x1801 BASE_ADDRESS_1=0xE8101400 BASE_ADDRESS_2=0xE8101000
and the second modprobe succeeds.
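For readers unfamiliar with the register values in the setpci one-liner, here is a minimal sketch that decodes them using the standard PCI configuration-header bit layout (COMMAND bits 0-2, and bit 0 of a base address register selecting I/O or memory space). The helper names are made up for illustration and are not part of setpci or the kernel.

```python
# Decode the values written by the setpci one-liner in the description.
# Helper names are illustrative only; the bit meanings follow the
# standard PCI configuration header.

COMMAND_BITS = {
    0: "I/O space enable",
    1: "memory space enable",
    2: "bus master enable",
}


def decode_command(value):
    """Return the names of the low COMMAND-register bits set in value."""
    return [name for bit, name in COMMAND_BITS.items() if value & (1 << bit)]


def decode_bar(value):
    """Split a 32-bit base address register into (kind, base address)."""
    if value & 0x1:
        # Bit 0 set: I/O space BAR; the two low bits are flags, not address.
        return ("io", value & ~0x3 & 0xFFFFFFFF)
    # Bit 0 clear: memory BAR; the four low bits carry type/prefetch flags.
    return ("mem", value & ~0xF & 0xFFFFFFFF)


if __name__ == "__main__":
    print("COMMAND=0x07:", decode_command(0x07))
    for raw in (0x1801, 0xE8101400, 0xE8101000):
        kind, base = decode_bar(raw)
        print(f"{raw:#010x}: {kind} window at {base:#x}")
```

Read this way, the one-liner turns on I/O decoding, memory decoding and bus mastering, and assigns one I/O window and two memory windows to the device by hand.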
I have some comments:
1.) Why does your T21 have a 3c556 NIC? (My T21 has Intel(R) PRO/100+MiniPCI.)
2.) If you suspect it is due to ACPI, would you please post /proc/interrupts, lspci -vv, dmesg (with ACPI fully enabled and the debug option enabled), and acpidmp?
3.) My T21, T23 and T40 don't have NIC problems.
Thanks a lot!

1. http://www-3.ibm.com/pc/support/site.wss/quickPath.do?quickPathEntry=26474bg
2647-4BG support
Product description: PIII 800MHz (256KB), 128MB RAM, 20.0GB HDD, 14.1 XGA(1024x768) TFT LCD, 8x-2.3x DVD, 3Com combo, TV out, Li-Ion Battery, W98
The 3Com combo (NIC/Modem) is a feature of this particular model.
2. Requested outputs are at http://www.relex.ru/~yarick/acpi
lspci.out == lspci -vv -xxx
lspci1.out == lspci -vv -H1
lspci2.out == lspci -vv
3. Glad for you :-)

Could you please do the test below:
1.) download acpi-20030730-2.6.0-test2.diff.gz
2.) download acpi-20030714-2.6.0-test1.diff.gz
3.) cd to the root directory of your 2.6-test4 tree.
4.) patch -Rp1 < ../acpi-20030730-2.6.0-test2.diff
5.) patch -Rp1 < ../acpi-20030714-2.6.0-test1.diff
Now you should have a 2.6.0-test4 tree with ACPI version 20030619. Then you can try this kernel and see what happens. At least, it helped me with resolving PCMCIA net card issues on a T23. I believe this could be a regression. Thanks a lot.

I want to clarify some things and, maybe, get some instructions.
1. My kernel is 2.6.0-test5.
2. This case is not a regression, at least not in the specified patch-frame. I have been tracking this issue since the 20030513 ACPI release, when I tried to use ACPI for the first time.
3. This combo is a MiniPCI card, not a CardBus/PCMCIA one. Also, I don't have any PCMCIA cards handy.
I'll download 2.6.0-test4 and revert the 200307xx ACPI patches to try it, but I have tried the 20030619 release (not on a 2.6.0-test4 kernel, of course) and it wasn't really different - the card wasn't working without additional manual initialisation (setpci ...).
Right now I'm pretty sure the DSDT for this model is broken, and no amount of tinkering with the ACPI subsystem could help. The best someone could do is to strictly check the common bus initialisation sequence and to skip invalid data.
Ok, I'm downloading 2.6.0-test4 now.

I found "00:03.0 Ethernet controller: 3Com Corporation 3c556B Hurricane CardBus (rev 20)". Maybe the name "CardBus" caused me to think of PCMCIA. (I think it should be "Ethernet Controller" if it is not a PCMCIA net card.)
Thanks for posting lspci, dmesg and /proc/interrupts for the failure case. Would you please post the same information for the successful case you mentioned in the bug Description? Thanks a lot.

Would you please give the patch at bug 1186 a try? Thanks a lot.

Ok. I've tried the bugfix for 1186 and it wasn't helpful. Requested outputs are at http://www.relex.ru/~yarick/acpi with stages from 1 (straight after boot, no attempts to modprobe 3c59x) to 4 (successful modprobe 3c59x):
Step 1: right after boot
Step 2: after unsuccessful modprobe 3c59x debug=6
Step 3: after setpci ...
Step 4: successful modprobe 3c59x debug=6
lspci outputs marked .H1. are from lspci -vv -H1
I haven't tried the latest ACPI revision (20030916) yet. Should I?
Thanks for your patience and helpfulness.

Created attachment 907 [details]
Try to verify if PCI config space could be messed up.
To demonstrate that your net card works well without ACPI, could you please
attach dmesg?
I suspect that the PCI config space for 00:03.00 could be messed up. There are two
places in ACPI that can call acpi_os_write_pci_configuration, which in turn calls
raw_pci_ops->write. One is acpi_hw_low_level_write, the other is
acpi_ex_pci_config_space_handler. I suspect acpi_ex_pci_config_space_handler
should treat the PCI bus id the way acpi_hw_low_level_write does, so I made this patch.
Thanks a lot!
Well, I have tested your patch - no success so far. Dmesg with acpi=off and successful loading of the NIC driver is at http://www.relex.ru/~yarick/acpi/dmesg.noacpi
Could you help me understand what these DSDT snippets do?
Device (LNKC)
{
    .......
    Method (_CRS, 0, NotSerialized)
    {
        Name (BUFC, ResourceTemplate ()
        {
            IRQ (Level, ActiveLow, Shared) {}
        })
        CreateWordField (BUFC, 0x01, IRC1)
        And (\_SB.PCI0.ISA.PIRC, 0x8F, Local0)
        If (VPIR (Local0))
        {
            Store (ShiftLeft (One, Local0), IRC1)
        }
        Return (BUFC)
    }
    Method (_SRS, 1, NotSerialized)
    {
        CreateWordField (Arg0, 0x01, IRC2)
        FindSetRightBit (IRC2, Local0)
        And (\_SB.PCI0.ISA.PIRC, 0x70, Local1)
        Or (Local1, Decrement (Local0), Local1)
        Store (Local1, \_SB.PCI0.ISA.PIRC)
    }
}
.....
}

_CRS is a standard device configuration control method that returns the current resource settings. There should be an error message if _CRS returns an IRQ setting that does not match the IRQ selected in acpi_pci_link_set. _SRS is also a standard device configuration control method; it does the actual setting. _CRS is used to verify whether _SRS was successful. In this case they seem to be ok, because I didn't find an error message for them. To verify that, you can test a kernel with the ACPI debug option on.
I guess there could be something wrong with the operations on the PCI config space of 00:03.0. Since that device can work without ACPI, could you please insert
printk("pci_conf1_write: seg=%04x, bus=%04x, devfn=%04x, reg=%04x,len=%04x value=%04x\n", seg, bus, devfn, reg, len, value);
into pci_conf1_write or other similar functions to monitor it? I just want to get the difference in PCI config space operations between a kernel with ACPI and a kernel without ACPI; that could indicate where the bug is. Would you please post them for analysis? Thanks a lot!

Created attachment 945 [details]
patch to collect operations to pci config space using configuration type 1
This patch monitors each write operation to PCI config space using
configuration type 1. Would you please try it and post dmesg for a kernel with
ACPI enabled and for one with ACPI disabled? With this method we can narrow
down the problem. Thanks a lot.
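Once both dmesg captures exist, the comparison could be done along these lines. This is only a sketch: it assumes the log lines follow the printk format suggested in the comment above, and it filters on slot 3 because the NIC is 00:03.0. The function names and the sample line in the usage part are invented for illustration and are not taken from any attached log.

```python
import re

# Matches lines produced by the printk format quoted in this thread:
# "pci_conf1_write: seg=%04x, bus=%04x, devfn=%04x, reg=%04x,len=%04x value=%04x"
LINE_RE = re.compile(
    r"pci_conf1_write: seg=([0-9a-f]+), bus=([0-9a-f]+), devfn=([0-9a-f]+), "
    r"reg=([0-9a-f]+),\s*len=([0-9a-f]+),?\s*value=([0-9a-f]+)"
)


def parse_writes(log_text):
    """Return a list of (bus, slot, func, reg, length, value) tuples."""
    writes = []
    for match in LINE_RE.finditer(log_text):
        _seg, bus, devfn, reg, length, value = (int(g, 16) for g in match.groups())
        # devfn packs the device (slot) number in bits 7..3 and the
        # function number in bits 2..0.
        writes.append((bus, devfn >> 3, devfn & 0x7, reg, length, value))
    return writes


def diff_writes(acpi_log, noacpi_log, bus=0, slot=3):
    """Return config writes to one device that appear in only one log."""
    acpi = [w for w in parse_writes(acpi_log) if w[0] == bus and w[1] == slot]
    noacpi = [w for w in parse_writes(noacpi_log) if w[0] == bus and w[1] == slot]
    only_acpi = [w for w in acpi if w not in noacpi]
    only_noacpi = [w for w in noacpi if w not in acpi]
    return only_acpi, only_noacpi


if __name__ == "__main__":
    # Made-up sample line, not from the reporter's dmesg.
    sample = ("pci_conf1_write: seg=0000, bus=0000, devfn=0018, "
              "reg=0004,len=0002 value=0007\n")
    print(diff_writes(sample, ""))
```

Writes that show up on only one side are the candidates worth looking at first. Note that the comparison ignores ordering, which may also matter.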
Hi! Sorry for being silent for so long - I'm rather busy now. I've inserted the proposed printk into pci_conf1_write, but I'm unable to capture the full dmesg output with it - it is clipped somewhere in the middle. I've tried to increase the kernel message buffer size up to 1 Mb - it didn't help. It looks like a bug somewhere in printk.c. I will definitely try to fix it (or, probably, I've messed up somewhere myself), and will post results.

Same problem here on a Thinkpad A21p. I guess that all Thinkpads are broken - the bug only shows up on 3c556B cards. Since most Thinkpads have an EEpro100, the bug does not show up there. Output of lspci, dmesg and /proc/acpi/dsdt can be found at http://www.hello-penguin.com/thinkpad/

It is not immediately obvious from the 3com failure message that this is an interrupt issue. But the fact that it is trying to share IRQ9 with ACPI makes that possibility worth checking out... I didn't notice the /proc/interrupts from the non-ACPI case -- could you boot with acpi=off and attach them, showing interrupts on eth0?

Yes, 3com is device 3, uses LNKC, and LNKC defaults to IRQ9, which is also used by ACPI. There is no reason in theory that this shouldn't work. Currently we don't move active PIC interrupts around, though it is becoming clear that in some cases we should. Just for grins, please revert this patch (i.e. patch -Rp1 <tmp.patch) to see
a. if it causes us to move the eth0 interrupt so it is not shared with ACPI
b. if so, whether that helps things.
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/test/2.4.22/20030930004558-acpi_pci_link_allocate.patch
thanks, -Len

Your patch does not change the interrupt assignment, sorry: http://www.hello-penguin.com/thinkpad/proc_interrupts_2.6.0-pre9-acpi
But it is true that there are no interrupts. The interrupt count stays at 40 all the time. (9: 40 XT-PIC acpi, eth0)
Note that it's not only an interrupt problem: memory addressing also seems not to work, because the MAC is FF:FF:FF:FF:FF:FF and all return codes are FFFF.
I guess that the device is simply in some sleeping state or uninitialized - nothing seems to work except accessing the PCI configuration area for that device. Is the PCI config space really ok? Diffing lspci -vvv (acpi=off against acpi) shows MemWINV+/MemWINV-, so the memory regions are marked virtual in the acpi case.

Still a problem with 2.4.26 or 2.6.5?

2.6.5, yes. I haven't tried 2.4.26 yet, waiting for the final release. The behaviour is the same - the board doesn't initialise properly upon cold reboot. It does get initialised properly, however, after an STR/resume cycle.

Can you attach the 2.6.5 dmesg, lspci -v and /proc/interrupts? Can you verify that IRQ9 functions properly as an ACPI interrupt: stop acpid if it is running, cat /proc/acpi/event, press the power button a few times and see if events come out -- you should also see /proc/interrupts increment IRQ9 once for each button press. No need to test 2.4.26 since we've got 2.6.5.
thanks, -Len

Still an issue with 2.6.6? If yes, can you attach /proc/ioports and the output from acpidmp, available in /usr/sbin/ or in pmtools: http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/

Created attachment 2916 [details]
/proc/ioports from 2.6.6-mm3 on ThinkPad T21
Created attachment 2917 [details]
acpidmp >/tmp/acpidmp.output for the linux-2.6.6-mm3 on a ThinkPad T21
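As background for the IRQ9 discussion, the LNKC `_CRS` and `_SRS` methods quoted earlier in this thread can be modelled roughly as below. This is a sketch under stated assumptions, not a statement about what the firmware really does: `VPIR` is never shown in the thread, so `vpir_assumed` is a guess (bit 7 clear and a non-zero IRQ number), and `PIRC` is assumed to be an 8-bit field. `FindSetRightBit` follows the ACPI definition (one-based position of the least significant set bit, zero if none).

```python
# Rough model of the LNKC _CRS/_SRS methods quoted in this thread.
# `pirc` stands for the value of \_SB.PCI0.ISA.PIRC.


def vpir_assumed(value):
    """ASSUMPTION: stand-in for the DSDT's VPIR, which the thread never shows."""
    return (value & 0x80) == 0 and (value & 0x0F) != 0


def find_set_right_bit(value):
    """One-based position of the least significant set bit, 0 if none."""
    return (value & -value).bit_length()


def crs(pirc):
    """Return the 16-bit IRQ mask that _CRS would place in IRC1."""
    local0 = pirc & 0x8F
    if vpir_assumed(local0):
        return (1 << local0) & 0xFFFF
    return 0


def srs(pirc, irq_mask):
    """Return the PIRC value that _SRS would store for the given IRQ mask."""
    local0 = find_set_right_bit(irq_mask) - 1  # Decrement (Local0)
    local1 = pirc & 0x70                       # keep bits 6..4, drop bit 7
    return (local1 | local0) & 0xFF            # ASSUMPTION: PIRC is 8 bits wide


if __name__ == "__main__":
    new_pirc = srs(0x80, 1 << 9)  # request IRQ 9
    print(hex(new_pirc), hex(crs(new_pirc)))
```

In this model `_SRS` clears bit 7 and writes the IRQ number into the low bits, and `_CRS` reads the same number back as a bitmask, which matches the remark above that the two methods look consistent with each other.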
Yes, the bug is still there on 2.6.6-mm3.

The IRQ looks ok, and ioports shows no conflict. Could you please attach the dmesg from the latest kernel, and attach /proc/iomem? I want to check it. Thanks.

Created attachment 3173 [details]
iomem of Thinkpad T20 2.6.6
Created attachment 3174 [details]
lspci of Thinkpad T20 2.6.6
Created attachment 3175 [details]
dmsg of Thinkpad T20 2.6.6
Created attachment 3176 [details]
syslog of Thinkpad T20 2.6.6
Comment on attachment 3174 [details]
lspci of Thinkpad T20 2.6.6
sorry
Created attachment 3177 [details]
Log summary Thinkpad T20 2.6.6
Look at the difference between lspci.pre_driver_loaded and post_driver_loaded.
The adapter dies when the driver is loaded. iomem shows many more used areas
without ACPI.
FYI I'm still seeing this bug on 2.6.8.1 on my IBM T-22 laptop. If none of the rest of these guys are still available for testing I'd be happy to perform any testing you need.

*** Bug 2595 has been marked as a duplicate of this bug. ***

Created attachment 4325 [details]
dmesg from IBM A21P, driver failure
Created attachment 4326 [details]
dmesg from IBM A21P, drivers works ok with acpi=off
An instance of something that could be the same or at least a very similar problem. On my IBM A21P the 3c59x driver in all kernels I've ever tried won't work with ACPI enabled, but works pretty well with ACPI disabled. The symptoms are the following messages on initialization of the driver:
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
0000:00:03.0: 3Com PCI 3c556B Laptop Hurricane at 0x1400. Vers LK1.1.19
PCI: Setting latency timer of device 0000:00:03.0 to 64
*** EEPROM MAC address is invalid.
3c59x: vortex_probe1 fails. Returns -22
3c59x: probe of 0000:00:03.0 failed with error -22
This also seems to be independent of the BIOS version. I tried V1.02, which was installed when I bought the machine, and now V1.11, the latest from summer 2004. It's just that the problem got more annoying with the latest BIOS, because with it Linux will default to enabling ACPI ...

*** Bug 3046 has been marked as a duplicate of this bug. ***

Has this device ever worked for any 2.4 or 2.6 ACPI-enabled kernel?

In Thinkpads - no, AFAIR. The problem is not in the 3c556 per se, but in the Thinkpad's BIOS/DSDT. Right now the same card (3Com NIC/Modem combo) is in my HP Omnibook 6K, and works flawlessly with ACPI, with 2.6.9.

Hi. I notice there's been no movement on this for a few months. Does anyone know whether it's possible to work around this problem by incorporating a customized/"repaired" DSDT into the kernel? (documented, for example, at http://forums.gentoo.org/viewtopic.php?t=122145)
Jaime

Please try kernel option pci=noacpi

Luming Yu, when I use "pci=noacpi" the 3Com NIC doesn't work. (It *does* when I use "acpi=off".) Post #38 seems to suggest that it's more an error in the DSDT than in the ACPI code - do you know whether a custom DSDT might give me working ACPI *and* a working 3C556B?
Thank you, Jaime

Here's the RedHat Bugzilla bug about this issue: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=158725
No solution there, but a few more data points.
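The "EEPROM MAC address is invalid" and "-22" lines reported above fit together with the earlier observation that the device reads back FF:FF:FF:FF:FF:FF: on Linux, error 22 is EINVAL, and an all-ones address is a broadcast address, which the usual Ethernet sanity rule rejects. The sketch below mirrors that general rule (not all-zero, not multicast). Whether the 3c59x probe applies exactly this check is an assumption here, and the "good" address in the usage part is made up.

```python
def is_valid_ether_addr(mac):
    """Apply the common rule: six octets, not all-zero, not multicast."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        return False
    is_zero = not any(octets)
    # The least significant bit of the first octet marks multicast,
    # which includes the all-ones broadcast address.
    is_multicast = bool(octets[0] & 0x01)
    return not is_zero and not is_multicast


if __name__ == "__main__":
    # All-ones is what a read from an unresponsive device tends to look like.
    print(is_valid_ether_addr("FF:FF:FF:FF:FF:FF"))
    # Made-up unicast address for comparison.
    print(is_valid_ether_addr("00:11:22:33:44:55"))
```

So the MAC error is more likely a symptom of the card not responding to register reads than of a damaged EEPROM, which agrees with the card working once acpi=off is used.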
Same problem on my ThinkPad T21 with 2.6.12 and 2.6.11.x.

Could you please test the patch from John W. Linville at http://marc.theaimsgroup.com/?l=linux-kernel&m=112247477326714&w=2 ? I think it will fix your bug. Thanks!

John Linville's patch works for me with the 3Com PCI 3c556B Laptop Hurricane in an IBM A21p laptop, so as far as I'm concerned this bug can be closed.

Great to know that a two-year-old bug got fixed.