Bug 1188

Summary: 3com NIC fails to initialise unless acpi=off - T21, T22
Product: Drivers Reporter: Yaroslav Rastrigin (yarick)
Component: PCI Assignee: Luming Yu (luming.yu)
Status: RESOLVED CODE_FIX    
Severity: normal CC: acpi-bugzilla, bugzilla.kernel.org, gj, Jon.Kibler, kernel.bugzilla-eran, patl, ralf, stefan
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.4.19 - .22, 2.5.65 - 2.6.0-test4 Subsystem:
Regression: --- Bisected commit-id:
Attachments: dmesg output
Try to verify if PCI config space could be messed up.
patch to collect operations to pci config space using configuration type 1
/proc/ioports from 2.6.6-mm3 on ThinkPad T21
acpidmp >/tmp/acpidmp.output for the linux-2.6.6-mm3 on a ThinkPad T21
iomem of Thinkpad T20 2.6.6
lspci of Thinkpad T20 2.6.6
dmsg of Thinkpad T20 2.6.6
syslog of Thinkpad T20 2.6.6
Log summary Thinkpad T20 2.6.6
dmesg from IBM A21P, driver failure
dmesg from IBM A21P, drivers works ok with acpi=off

Description Yaroslav Rastrigin 2003-09-05 11:36:06 UTC
Distribution:
ALT Linux Master 2.2
Hardware Environment:
IBM ThinkPad T21 Model no. 2647-4BG BIOS release date 29/03/2003 
Software environment:
ACPI revisions 20030519 - 20030813
Problem Description:
If ACPI ("Full ACPI support" in recent kernels) is enabled, the MiniPCI NIC fails
to initialise: the driver cannot find any MII transceiver and the card does not
function. Some additional manual steps are required, and even they only
sometimes make it work.
Steps to reproduce:
Build a kernel with Full ACPI support and enable 3Com 59x (Vortex/Boomerang) NIC
support (it doesn't matter whether it is built as a module or in-kernel). Reboot
into the new kernel; dmesg output:
PCI: Enabling device 0000:00:03.0 (0000 -> 0003)
PCI: Found IRQ 11 for device 0000:00:03.0
PCI: Sharing IRQ 11 with 0000:00:03.1
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
0000:00:03.0: 3Com PCI 3c556B Laptop Hurricane at 0x1800. Vers LK1.1.19
PCI: Setting latency timer of device 0000:00:03.0 to 64
  ***WARNING*** No MII transceivers found!

Script that sometimes helps me initialise it properly (I can't tell exactly why
it works on some attempts and not on others):
setpci -v -H 1 -s 00:03.00 COMMAND=0x07 CACHE_LINE_SIZE=0 LATENCY_TIMER=0x40 \
BASE_ADDRESS_0=0x1801 BASE_ADDRESS_1=0xE8101400 BASE_ADDRESS_2=0xE8101000
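To make the values in that setpci line easier to read, here is a minimal Python sketch (an illustration, not part of the original report) that decodes a PCI command-register value and a base address register (BAR) according to the standard PCI configuration header layout. The helper names decode_command and decode_bar are made up for this example.

```python
# Illustrative decoder for the values passed to setpci above.
# Helper names are hypothetical; bit layouts follow the standard PCI header.

COMMAND_BITS = {
    0: "I/O space",
    1: "Memory space",
    2: "Bus master",
    3: "Special cycles",
    4: "Memory write and invalidate",
    5: "VGA palette snoop",
    6: "Parity error response",
    8: "SERR# enable",
    9: "Fast back-to-back",
}


def decode_command(value):
    """Return the names of the enable bits set in a PCI COMMAND value."""
    return [name for bit, name in sorted(COMMAND_BITS.items())
            if value & (1 << bit)]


def decode_bar(value):
    """Split a 32-bit BAR value into (kind, base address)."""
    if value & 0x1:
        # Bit 0 set: I/O BAR, the low two bits are flags.
        return ("io", value & ~0x3 & 0xFFFFFFFF)
    # Bit 0 clear: memory BAR, the low four bits are flags.
    return ("mem", value & ~0xF & 0xFFFFFFFF)


if __name__ == "__main__":
    print(decode_command(0x0003))  # the "(0000 -> 0003)" change in dmesg
    print(decode_command(0x07))    # COMMAND=0x07 from the setpci line
    for bar in (0x1801, 0xE8101400, 0xE8101000):
        kind, base = decode_bar(bar)
        print(f"{bar:#010x}: {kind} at {base:#x}")
```

Read this way, COMMAND=0x07 enables I/O decoding, memory decoding and bus mastering, and BASE_ADDRESS_0=0x1801 is an I/O BAR at 0x1800, which matches the "at 0x1800" address the 3c59x driver prints.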
Comment 1 Yaroslav Rastrigin 2003-09-05 11:39:54 UTC
Created attachment 823 [details]
dmesg output 

Near the tail of this dmesg you can see that the first attempt at
modprobe 3c59x
fails; then the following one-liner is executed:
setpci -v -H 1 -s 00:03.00 COMMAND=0x07 CACHE_LINE_SIZE=0 LATENCY_TIMER=0x40 \
BASE_ADDRESS_0=0x1801 BASE_ADDRESS_1=0xE8101400 BASE_ADDRESS_2=0xE8101000
and the second modprobe succeeds.
Comment 2 Luming Yu 2003-09-08 19:23:08 UTC
I have some comments:
1.) Why does your T21 have a 3c556 NIC? (My T21 has an Intel(R) PRO/100+MiniPCI.)
2.) If you suspect it is due to ACPI, would you please post /proc/interrupts, 
lspci -vv, dmesg (with ACPI fully enabled and the debug option on) and acpidmp?
3.) My T21, T23 and T40 don't have this NIC problem.

Thanks a lot!
Comment 3 Yaroslav Rastrigin 2003-09-09 03:33:49 UTC
1. http://www-3.ibm.com/pc/support/site.wss/quickPath.do?quickPathEntry=26474bg
2647-4BG support
Product description
PIII 800MHz (256KB), 128MB RAM, 20.0GB HDD, 14.1 XGA(1024x768) TFT LCD, 8x-2.3x 
DVD, 3Com combo, TV out, Li-Ion Battery, W98
3Com combo (NIC/Modem) is a feature of this particular model.
2. Requested outputs are at
http://www.relex.ru/~yarick/acpi
lspci.out == lspci -vv -xxx 
lspci1.out == lspci -vv -H1
lspci2.out == lspci -vv
3. Glad for you :-)
Comment 4 Luming Yu 2003-09-12 08:03:59 UTC
Could you please do the test below:
1.) download acpi-20030730-2.6.0-test2.diff.gz
2.) download acpi-20030714-2.6.0-test1.diff.gz
3.) cd to the root directory of your 2.6.0-test4 tree.
4.) patch -Rp1 < ../acpi-20030730-2.6.0-test2.diff (after gunzipping the download)
5.) patch -Rp1 < ../acpi-20030714-2.6.0-test1.diff (likewise)

Now you should have a 2.6.0-test4 tree with ACPI version 20030619.
Then you can try this kernel and see what happens.

At least, it helped me resolve PCMCIA network card issues on a T23.
I believe this could be a regression.
Thanks a lot.
Comment 5 Yaroslav Rastrigin 2003-09-12 09:26:31 UTC
I want to clarify some things and, maybe, get some instructions.
1. My kernel is 2.6.0-test5. 
2. This case is not a regression, at least not within the specified patch range. 
I have been tracking this issue since the 20030513 ACPI release, when I tried to
use ACPI for the first time.
3. This combo is a MiniPCI card, not a CardBus/PCMCIA one. 
Also, I don't have any PCMCIA cards handy.

I'll download 2.6.0-test4 and revert the 200307xx ACPI patches to try it, but I 
have already tried the 20030619 release (not on a 2.6.0-test4 kernel, of course)
and it wasn't really different: the card didn't work without additional manual 
initialisation (setpci ...). 
Right now I'm pretty sure the DSDT for this model is broken, and no amount of 
tinkering with the ACPI subsystem will help. The best anyone could do is to 
strictly check the common bus initialisation sequence and skip invalid data. 
OK, I'm downloading 2.6.0-test4 now. 
Comment 6 Luming Yu 2003-09-16 04:54:07 UTC
I found "00:03.0 Ethernet controller: 3Com Corporation 3c556B Hurricane CardBus
(rev 20)". Maybe the name "CardBus" is what made me turn to PCMCIA. (I think it
should be just "Ethernet Controller" if it is not a PCMCIA network card.)

Thanks for posting lspci, dmesg and /proc/interrupts for the failure case. 

Would you please post the same information for the successful case you mentioned
in the bug description? Thanks a lot.
Comment 7 Luming Yu 2003-09-17 00:48:50 UTC
Would you please give the patch at bug 1186 a try? Thanks a lot.
Comment 8 Yaroslav Rastrigin 2003-09-17 05:33:57 UTC
OK. I've tried the bugfix for 1186 and it didn't help. 
The requested outputs are at 
http://www.relex.ru/~yarick/acpi
with stages from 1 (straight after boot, no attempts to modprobe 3c59x) to 4
(successful modprobe 3c59x):
Step 1: right after boot 
Step 2: after unsuccessful modprobe 3c59x debug=6
Step 3: after setpci ...
Step 4: successful modprobe 3c59x debug=6
The lspci outputs marked .H1. are from lspci -vv -H1 

I haven't tried the latest ACPI revision (20030916) yet. Should I?

Thanks for your patience and helpfulness.
Comment 9 Luming Yu 2003-09-18 06:02:28 UTC
Created attachment 907 [details]
Try to verify if PCI config space could be messed up. 

To demonstrate that your network card works without ACPI, could you please
attach the dmesg from an acpi=off boot?

I suspect that the PCI config space for 00:03.00 could be messed up. There are two
places in ACPI that can call acpi_os_write_pci_configuration, which in turn calls
raw_pci_ops->write. One is acpi_hw_low_level_write, the other is
acpi_ex_pci_config_space_handler. I suspect acpi_ex_pci_config_space_handler
should treat the PCI bus id the same way acpi_hw_low_level_write does, so I made
this patch.

Thanks a lot!
Comment 10 Yaroslav Rastrigin 2003-09-19 09:58:30 UTC
Well, I have tested your patch - no success so far. 
The dmesg with acpi=off and successful loading of the NIC driver is at 
http://www.relex.ru/~yarick/acpi/dmesg.noacpi
Could you help me understand what these DSDT snippets do?
    Device (LNKC)
    {
        .......
        Method (_CRS, 0, NotSerialized)
        {
            Name (BUFC, ResourceTemplate ()
            {
                IRQ (Level, ActiveLow, Shared) {}
            })
            CreateWordField (BUFC, 0x01, IRC1)
            And (\_SB.PCI0.ISA.PIRC, 0x8F, Local0)
            If (VPIR (Local0))
            {
                Store (ShiftLeft (One, Local0), IRC1)
            }

            Return (BUFC)
        }

        Method (_SRS, 1, NotSerialized)
        {
            CreateWordField (Arg0, 0x01, IRC2)
            FindSetRightBit (IRC2, Local0)
            And (\_SB.PCI0.ISA.PIRC, 0x70, Local1)
            Or (Local1, Decrement (Local0), Local1)
            Store (Local1, \_SB.PCI0.ISA.PIRC)
        }
    }
    .....
}
Comment 11 Luming Yu 2003-09-24 02:05:55 UTC
  _CRS is a standard device configuration control method that returns the current
resource settings. There should be an error message if _CRS returns an IRQ
setting that does not match the IRQ selected in acpi_pci_link_set.
  _SRS is also a standard device configuration control method; it does the actual
setting.
  _CRS is used to verify whether _SRS was successful.

  In this case they seem to be OK, because I didn't find any error message from
them. To verify that, you can test a kernel with the ACPI debug option on.
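
As a rough illustration of what the two methods compute, the arithmetic in the DSDT snippet quoted in comment 10 can be mimicked in Python. This is only a sketch: the VPIR method is not shown in this report, so vpir() below is an assumption that merely checks that bit 7 of the routing value is clear, PIRC is assumed to be an 8-bit field, and all function names are invented for the example.

```python
def find_set_right_bit(mask):
    """ASL FindSetRightBit: one-based index of the lowest set bit, 0 if none."""
    return (mask & -mask).bit_length()


def vpir(value):
    """ASSUMPTION: stand-in for the DSDT's VPIR method, which is not shown.
    Treats the routing value as valid when bit 7 is clear."""
    return (value & 0x80) == 0


def crs(pirc):
    """Mimic _CRS: return the IRQ bitmask for the current PIRC value."""
    local0 = pirc & 0x8F              # And (PIRC, 0x8F, Local0)
    irq_mask = 0                      # empty IRQ descriptor
    if vpir(local0):
        irq_mask = 1 << local0        # Store (ShiftLeft (One, Local0), IRC1)
    return irq_mask


def srs(pirc, irq_mask):
    """Mimic _SRS: return the new PIRC value for the requested IRQ bitmask."""
    local0 = find_set_right_bit(irq_mask)   # FindSetRightBit (IRC2, Local0)
    local1 = pirc & 0x70                    # keep bits 6:4 of the old value
    local1 |= (local0 - 1)                  # Or (Local1, Decrement (Local0), Local1)
    return local1 & 0xFF                    # assumed 8-bit PIRC field


if __name__ == "__main__":
    pirc = srs(0x80, 1 << 11)   # ask for IRQ 11 on a link whose bit 7 is set
    print(hex(pirc), hex(crs(pirc)))
```

Under these assumptions, programming IRQ 11 with srs() and reading it back with crs() returns the same single-bit mask, which is the consistency check described above.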

  I guess there could be something wrong with the operations on the PCI config
space of 00:03.0. Since that device works without ACPI, could you please insert 

 printk("pci_conf1_write: seg=%04x, bus=%04x, devfn=%04x, reg=%04x,len=%04x "
        "value=%04x\n", seg, bus, devfn, reg, len, value);

into pci_conf1_write or other similar functions to monitor it?
 I just want to see the difference in PCI config space operations between a
kernel with ACPI and a kernel without ACPI; that could indicate where the bug is.
 Would you please post both logs for analysis? Thanks a lot!
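
Once both logs exist, comparing them by hand is tedious. The following Python sketch (an illustration, not something used in this thread) extracts the writes to one device from two dmesg captures and lists the ones that appear in only one of them. It assumes the log lines follow the printk format above; the function names are invented, and devfn 0x18 corresponds to device 3, function 0, i.e. (3 << 3) | 0.

```python
import re

# Matches the fields of the printk format quoted above; the separator before
# "value=" is matched loosely because the format string was wrapped there.
LINE_RE = re.compile(
    r"pci_conf1_write:\s*seg=([0-9a-fA-F]+),\s*bus=([0-9a-fA-F]+),\s*"
    r"devfn=([0-9a-fA-F]+),\s*reg=([0-9a-fA-F]+),\s*len=([0-9a-fA-F]+)"
    r"[,\s]+value=([0-9a-fA-F]+)"
)


def parse_writes(dmesg_text, bus=0, devfn=0x18):
    """Return [(reg, len, value), ...] for writes to one device, in log order."""
    writes = []
    for match in LINE_RE.finditer(dmesg_text):
        seg, b, df, reg, length, value = (int(g, 16) for g in match.groups())
        if b == bus and df == devfn:
            writes.append((reg, length, value))
    return writes


def diff_writes(with_acpi, without_acpi):
    """Return (writes seen only with ACPI, writes seen only without ACPI)."""
    only_acpi = [w for w in with_acpi if w not in without_acpi]
    only_noacpi = [w for w in without_acpi if w not in with_acpi]
    return only_acpi, only_noacpi
```

A typical use would be to read the two saved dmesg files (their names are up to the tester), pass each through parse_writes(), and print the two lists returned by diff_writes().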

Comment 12 Luming Yu 2003-09-28 04:00:05 UTC
Created attachment 945 [details]
patch to collect operations to pci config space using configuration type 1

This patch monitors each write operation to PCI config space using
configuration type 1. Would you please try it and post the dmesg for a kernel
with ACPI enabled and one with ACPI disabled? With this method we can narrow
down the problem. Thanks a lot.
Comment 13 Yaroslav Rastrigin 2003-10-01 03:18:25 UTC
Hi !
Sorry for being silent for so long - I'm rather busy now. 
I've inserted the proposed printk into pci_conf1_write, but I'm unable to capture 
the full dmesg output with it - it is clipped somewhere in the middle. I've tried
increasing the kernel message buffer size up to 1 MB - it didn't help. It looks
like a bug somewhere in printk.c. I will definitely try to fix it (or perhaps
I've messed something up myself) and will post the results. 
Comment 14 Stefan Traby 2003-10-29 12:00:37 UTC
Same problem here on a Thinkpad A21p.
I guess that all Thinkpads are broken - the bug only shows up with 3c556B
cards. Since most Thinkpads have an EEpro100, the bug does not show up
there.
Output of lspci, dmesg and /proc/acpi/dsdt
can be found on http://www.hello-penguin.com/thinkpad/
Comment 15 Len Brown 2003-10-29 22:05:49 UTC
Not immediately obvious from the 3com failure message that this is 
an interrupt issue.  But the fact that it is trying to share IRQ9 with ACPI 
makes that possibility worth checking out... 
 
I didn't notice the /proc/interrupts from the non-acpi case -- could 
you boot with acpi=off and attach them, showing interrupts on eth0? 
 
Yes, 3com is device 3, uses LNKC, and LNKC defaults to IRQ9, 
which is also used by ACPI.   No reason in theory that this shouldn't work. 
 
Currently we don't move active PIC interrupts around, though it is becoming 
clear that in some cases we should.  Just for grins, please revert this patch 
(ie. patch -Rp1 <tmp.patch) to see if 
a.  whether it causes us to move the eth0 interrupt so it is not shared with ACPI 
b.  if so, whether that helps things. 
 
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/test/2.4.22/20030930004558-acpi_pci_link_allocate.patch 
 
thanks, 
-Len 
  
Comment 16 Stefan Traby 2003-10-30 06:20:58 UTC
Your patch does not change the interrupt assignment, sorry:
http://www.hello-penguin.com/thinkpad/proc_interrupts_2.6.0-pre9-acpi
But it is true that there are no interrupts; the interrupt count stays at 40 all
the time. (9:         40          XT-PIC  acpi, eth0)
Note that it's not only an interrupt problem: memory addressing also seems
broken, because the MAC is FF:FF:FF:FF:FF:FF and all return codes are FFFF.
I guess that the device is simply in some sleeping state or uninitialized -
nothing seems to work except accessing the PCI configuration area for that device.
Is the PCI config space really OK?
Diffing lspci -vvv (acpi=off against acpi) shows MemWINV+ versus MemWINV-,
so the Memory Write and Invalidate bit of the PCI command register differs
between the two cases.
Comment 17 Len Brown 2004-04-12 21:35:42 UTC
Still a problem with 2.4.26 or 2.6.5? 
Comment 18 Yaroslav Rastrigin 2004-04-13 04:06:30 UTC
2.6.5, yes. I haven't tried 2.4.26 yet; I'm waiting for the final release. The 
behaviour is the same - the board doesn't initialise properly upon cold 
reboot. It does get initialised properly, however, after an STR/resume cycle.  
Comment 19 Len Brown 2004-04-23 20:49:39 UTC
Can you attach the 2.6.5 dmesg, lspci -v and /proc/interrupts? 
Can you verify that IRQ9 functions properly as an ACPI interrupt? 
Stop acpid if it is running, then cat /proc/acpi/event and press 
the power button a few times to see if events come out -- 
you should also see /proc/interrupts increment IRQ9 once for 
each button press. 
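
For anyone who prefers to compare the counters mechanically, here is a small Python sketch (an illustration only, not part of the original request) that takes two snapshots of /proc/interrupts as text and reports how much the IRQ 9 counter moved. The function names are invented; the parsing assumes the usual layout of an IRQ label, a colon, one counter per CPU, then the controller and device names.

```python
def irq_count(interrupts_text, irq):
    """Sum the per-CPU counters on one IRQ line of /proc/interrupts text.
    Returns None when the IRQ is not listed."""
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        total = 0
        for token in rest.split():
            if not token.isdigit():
                break  # first non-numeric token starts the controller/name columns
            total += int(token)
        return total
    return None


def irq_delta(before_text, after_text, irq=9):
    """Return how many interrupts were counted between two snapshots."""
    before = irq_count(before_text, irq)
    after = irq_count(after_text, irq)
    if before is None or after is None:
        return None
    return after - before
```

Take one snapshot before pressing the power button and one after; if IRQ 9 is delivering ACPI events, irq_delta() should roughly equal the number of presses.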
 
no need to test 2.4.26 since we've got 2.6.5. 
 
thanks, 
-Len 
 
 
Comment 20 Len Brown 2004-05-17 21:26:26 UTC
Still an issue with 2.6.6? 
 
If yes, can you attach /proc/ioports and the output from acpidmp 
available in /usr/sbin/, or in pmtools: 
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/ 
 
Comment 21 Yaroslav Rastrigin 2004-05-19 11:48:51 UTC
Created attachment 2916 [details]
/proc/ioports from 2.6.6-mm3 on ThinkPad T21
Comment 22 Yaroslav Rastrigin 2004-05-19 11:50:12 UTC
Created attachment 2917 [details]
acpidmp >/tmp/acpidmp.output for the linux-2.6.6-mm3 on a ThinkPad T21
Comment 23 Yaroslav Rastrigin 2004-05-19 11:51:09 UTC
Yes, the bug is still there on 2.6.6-mm3 
 
Comment 24 Shaohua 2004-05-28 00:42:54 UTC
The IRQ looks OK, and the I/O ports show no conflict. Could you please attach the
dmesg from the latest kernel, and also /proc/iomem? I want to check them. Thanks.
Comment 25 Daniel Poelzleithner 2004-06-15 04:23:37 UTC
Created attachment 3173 [details]
iomem of Thinkpad T20 2.6.6
Comment 26 Daniel Poelzleithner 2004-06-15 04:24:53 UTC
Created attachment 3174 [details]
lspci of Thinkpad T20 2.6.6
Comment 27 Daniel Poelzleithner 2004-06-15 04:27:58 UTC
Created attachment 3175 [details]
dmsg of Thinkpad T20 2.6.6
Comment 28 Daniel Poelzleithner 2004-06-15 04:29:24 UTC
Created attachment 3176 [details]
syslog of Thinkpad T20 2.6.6
Comment 29 Daniel Poelzleithner 2004-06-15 04:46:15 UTC
Comment on attachment 3174 [details]
lspci of Thinkpad T20 2.6.6

sorry
Comment 30 Daniel Poelzleithner 2004-06-15 04:55:36 UTC
Created attachment 3177 [details]
Log summary Thinkpad T20 2.6.6

Look at the difference between lspci.pre_driver_loaded and post_driver_loaded.
The adapter dies when the driver is loaded. iomem shows many more used areas
without ACPI.
Comment 31 Mark Bainter 2004-10-14 19:29:46 UTC
FYI I'm still seeing this bug on 2.6.8.1 on my IBM T-22 laptop.  If none of the
rest of these guys are still available for testing I'd be happy to perform any
testing you need.
Comment 32 Len Brown 2004-11-03 19:17:31 UTC
*** Bug 2595 has been marked as a duplicate of this bug. ***
Comment 33 ralf 2005-01-02 08:58:57 UTC
Created attachment 4325 [details]
dmesg from IBM A21P, driver failure
Comment 34 ralf 2005-01-02 09:00:20 UTC
Created attachment 4326 [details]
dmesg from IBM A21P, drivers works ok with acpi=off
Comment 35 ralf 2005-01-02 09:09:17 UTC
An instance of something that could be the same or at least a very similar
problem.  On my IBM A21P the 3c59x driver in all kernels I've ever tried won't
work with ACPI enabled but works pretty well with ACPI disabled.  The symptoms are
the following messages on initialization of the driver.

3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
0000:00:03.0: 3Com PCI 3c556B Laptop Hurricane at 0x1400. Vers LK1.1.19
PCI: Setting latency timer of device 0000:00:03.0 to 64
*** EEPROM MAC address is invalid.
3c59x: vortex_probe1 fails.  Returns -22
3c59x: probe of 0000:00:03.0 failed with error -22

This also seems to be independent of the BIOS version.  I tried V1.02, which was
installed when I bought the machine, and now V1.11, the latest from summer 2004.
It's just that the problem got more annoying with the latest BIOS, because with it
Linux will default to enabling ACPI ...
Comment 36 Len Brown 2005-01-03 22:12:40 UTC
*** Bug 3046 has been marked as a duplicate of this bug. ***
Comment 37 Len Brown 2005-01-03 22:14:11 UTC
Has this device ever worked for any 2.4 or 2.6 ACPI-enabled kernel?
Comment 38 Yaroslav Rastrigin 2005-01-03 23:30:41 UTC
In Thinkpads - no, AFAIR.  
The problem is not in the 3c556 per se, but in the Thinkpad's BIOS/DSDT.  
Right now the same card (3Com NIC/Modem combo) is in my HP Omnibook 6K, and it 
works flawlessly with ACPI on 2.6.9. 
 
Comment 39 Jaime 2005-04-23 02:59:26 UTC
Hi.

I notice there's been no movement on this for a few months. Does anyone know
whether it's possible to work around this problem by incorporating a
customized/"repaired" DSDT into the kernel? (documented, for example, at
http://forums.gentoo.org/viewtopic.php?t=122145)

Jaime
Comment 40 Luming Yu 2005-04-26 21:50:55 UTC
Please try kernel option pci=noacpi 
Comment 41 Jaime 2005-04-29 01:31:32 UTC
Luming Yu,

When I use "pci=noacpi" the 3Com NIC doesn't work. (It *does* when I use
"acpi=off"). Post #38 seems to suggest that it's more an error in the dsdt than
in the acpi code - do you know whether a custom dsdt might give me working acpi
*and* a working 3C556B?

Thank you, Jaime
Comment 42 kernel.bugzilla-eran 2005-06-19 16:51:08 UTC
Here's the RedHat Bugzilla bug about this issue:
  https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=158725
No solution there, but a few more data points.

Same problem on my ThinkPad T21 with 2.6.12 and 2.6.11.x.
Comment 43 Shaohua 2005-07-27 23:21:15 UTC
Could you please test the patch from John W. Linville at 
(http://marc.theaimsgroup.com/?l=linux-kernel&m=112247477326714&w=2)?
I think it will fix your bug. Thanks!
Comment 44 ralf 2005-08-01 06:40:58 UTC
John Linville's patch works for me with the 3Com PCI 3c556B Laptop Hurricane in
an IBM A21p laptop, so as far as I'm concerned this bug can be closed.
Comment 45 Shaohua 2005-08-01 18:32:34 UTC
Great to know that a two-year-old bug got fixed.