Bug 54861 - r6040 net-driver on new kernels not work(3.X.X)
Summary: r6040 net-driver on new kernels not work(3.X.X)
Status: RESOLVED CODE_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV4
Hardware: All Linux
Importance: P1 normal
Assignee: Florian Fainelli
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-03-05 16:01 UTC by Yaroslav
Modified: 2017-08-08 22:14 UTC
CC List: 3 users

See Also:
Kernel Version: 3.6.8 (3.7.8), possibly all newer versions
Subsystem:
Regression: No
Bisected commit-id:


Attachments
kernel config files (2.6.37.2, 3.6.8, 3.7.8) (26.58 KB, application/octet-stream)
2013-03-05 16:01 UTC, Yaroslav
bootlog for kernel 3.6.8 (like dmesg) (14.89 KB, application/octet-stream)
2013-03-06 06:54 UTC, Yaroslav
dmesg for kernel 3.6.8 (15.46 KB, application/octet-stream)
2013-03-06 06:55 UTC, Yaroslav
ethtool results (kernel 3.6.8) (2.05 KB, application/octet-stream)
2013-03-06 06:55 UTC, Yaroslav
ethtool results (kernel 2.6.37.2) (1.55 KB, application/octet-stream)
2013-03-06 07:10 UTC, Yaroslav
debug driver source (3.6.8 kernel) & output result (16.64 KB, application/octet-stream)
2013-03-11 15:28 UTC, Yaroslav

Description Yaroslav 2013-03-05 16:01:51 UTC
Created attachment 94581
kernel config files (2.6.37.2 , 3.6.8, 3.7.8)

Hello.

I have a problem.

I have a motherboard with an RDC8610 processor. The PHY chip is a tlk100 (Texas Instruments), which uses the "Generic PHY" driver from the kernel.

When I compile kernel 2.6.37.2, the network works fine.

When I compile kernel 3.6.8 or 3.7.8, I have network problems.
Pings do not work on the built-in network card (r6040).
ifconfig shows no errors, but "RX bytes" > 0 while "TX bytes" = 0! It seems that the transmitter is not initialized or not working.

Occasionally I get these messages on the console:

eth1: link UP
eth1: link DOWN

As far as I can see, the "TX bytes" value reported by ifconfig increases (but not significantly).


Network cable is connected and not damaged.

I can see that between kernel versions 2.6.37.2 and 3.6.8, the r6040 NIC driver was upgraded from 0.26 (30May2010) to 0.28 (07Oct2011).

I attach the kernel configuration files.

What is it? A bug in the new r6040 driver?
Or am I doing something wrong?
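
For anyone trying to reproduce the symptom without reading ifconfig by eye, here is a minimal sketch that reads the same byte counters from sysfs. The function names and the "stalled" heuristic are illustrative assumptions, not part of any kernel tool; the interface name eth1 is the one from this report.

```python
from pathlib import Path


def read_counters(iface, root="/sys/class/net"):
    """Return (rx_bytes, tx_bytes) for an interface, read from sysfs."""
    stats = Path(root) / iface / "statistics"
    rx = int((stats / "rx_bytes").read_text().strip())
    tx = int((stats / "tx_bytes").read_text().strip())
    return rx, tx


def looks_tx_stalled(rx_bytes, tx_bytes):
    """Illustrative heuristic: data is received but nothing was ever sent."""
    return rx_bytes > 0 and tx_bytes == 0


if __name__ == "__main__":
    rx, tx = read_counters("eth1")  # interface name taken from this report
    print("rx_bytes=%d tx_bytes=%d stalled=%s" % (rx, tx, looks_tx_stalled(rx, tx)))
```

The `root` parameter exists only so the reader can be pointed at a copy of the sysfs tree for testing.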
Comment 1 Florian Fainelli 2013-03-05 16:37:48 UTC
Hello,

(In reply to comment #0)

Can you give me the output of ethtool eth1 for your device? Thanks!
Comment 2 Florian Fainelli 2013-03-05 16:42:02 UTC
Having the r6040-related parts of the dmesg would also help a lot, thanks!
Comment 3 Florian Fainelli 2013-03-05 16:55:06 UTC
It looks like we need a proper PHY driver for this one, as it has some extended registers. Still, we need to figure out what is wrong with the current kernel version.
Comment 4 Yaroslav 2013-03-05 17:08:10 UTC
(In reply to comment #1)
> Can you give me the output of ethtool eth1 for your device? Thanks!

I will provide the ethtool log tomorrow.
Comment 5 Yaroslav 2013-03-05 17:22:39 UTC
I will also provide the dmesg output tomorrow.

-----

On kernel 2.6.37.2 the "Generic PHY" driver is also used for tlk100, and there it works.

Yes, tlk100 has more registers than usual, but its specification says that it is 100% compatible with a standard PHY chip (the standard tlk100 registers are located at the same addresses as on a standard MII PHY chip). The additional registers have different addresses.
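
To illustrate what "standard registers at the same addresses" means in practice, here is a minimal sketch that decodes the two basic MII registers. The addresses and bit masks are the standard IEEE 802.3 clause 22 values (the same ones defined in the kernel's linux/mii.h); the function names and the returned dictionary layout are assumptions made for this example.

```python
# Standard MII register addresses and bit masks (IEEE 802.3 clause 22).
MII_BMCR = 0x00  # basic mode control register
MII_BMSR = 0x01  # basic mode status register

BMCR_RESET = 0x8000
BMCR_SPEED100 = 0x2000
BMCR_ANENABLE = 0x1000
BMCR_PDOWN = 0x0800
BMCR_ISOLATE = 0x0400
BMCR_FULLDPLX = 0x0100

BMSR_ANEGCOMPLETE = 0x0020
BMSR_LSTATUS = 0x0004


def decode_bmcr(value):
    """Decode the control register into named flags."""
    return {
        "reset": bool(value & BMCR_RESET),
        "speed100": bool(value & BMCR_SPEED100),
        "autoneg": bool(value & BMCR_ANENABLE),
        "power_down": bool(value & BMCR_PDOWN),
        "isolate": bool(value & BMCR_ISOLATE),
        "full_duplex": bool(value & BMCR_FULLDPLX),
    }


def decode_bmsr(value):
    """Decode the status register into named flags."""
    return {
        "link_up": bool(value & BMSR_LSTATUS),
        "autoneg_complete": bool(value & BMSR_ANEGCOMPLETE),
    }
```

A generic PHY driver only relies on this standard register set, which is why it can drive a chip that also has vendor-specific extended registers.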
Comment 6 Yaroslav 2013-03-06 06:54:19 UTC
Created attachment 94621
bootlog for kernel 3.6.8 (like dmesg)

bootlog for kernel 3.6.8.
Comment 7 Yaroslav 2013-03-06 06:55:00 UTC
Created attachment 94631
dmesg for kernel 3.6.8

dmesg for kernel 3.6.8
Comment 8 Yaroslav 2013-03-06 06:55:57 UTC
Created attachment 94641
ethtool results (kernel 3.6.8)

ethtool results (kernel 3.6.8)
Comment 9 Yaroslav 2013-03-06 07:10:23 UTC
Created attachment 94651
ethtool results (kernel 2.6.37.2)

ethtool results (kernel 2.6.37.2)
Comment 10 Florian Fainelli 2013-03-06 10:31:54 UTC
Thanks Yaroslav, there are several things that I would like to check with you:

- can you add some prints to r6040_phy_read() to print the register address and value (using 0x%04x as the format specifier for the value)?
- can you revert the following commit 06e92c33999fd66128c2256b0461455633c3d53c (r6040: invoke phy_{start,stop} when appropriate) and see if that works better without it?

According to your bootlog-3.6.8, the PHY is being brought down when it should not be, and I would like to understand why.

Thank you!
Comment 11 Yaroslav 2013-03-06 11:45:03 UTC
Thank you for your comments, they gave me the idea.

If you look at the file "ethtool-2.6.37.2", it is evident that ethtool cannot determine the status of the network adapter:

Speed: Unknown!
Duplex: Unknown! (255)

This is suspicious! When I used ethtool to manually change the speed and duplex, I also broke the network on kernel 2.6.37.2. The behavior was the same as on kernel 3.6.8: occasional "eth1: link UP / DOWN" messages, and pings do not work.

I think it looks more like a hardware problem on the motherboard near the PHY chip and the MII interface of the processor. Perhaps the problem was always there, but the more recent r6040 driver handles errors more meticulously or performs additional initialization, while the old driver ignores the error.

I need more time to test it.
Comment 12 Florian Fainelli 2013-03-06 12:55:53 UTC
The ethtool output you are seeing is what commit 06e92c33999fd66128c2256b0461455633c3d53c is fixing: the PHY state machine was not properly started and did not report consistent values to ethtool.
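
As background on where speed and duplex values come from once autonegotiation has completed, here is a minimal sketch that resolves them from the local advertisement register (address 0x04) and the link partner ability register (address 0x05). The bit masks are the standard MII ones; the function name is hypothetical, 100BASE-T4 is deliberately ignored, and this is an illustration of the priority order rather than the kernel's actual code.

```python
# Standard ability bits shared by the advertisement (0x04) and
# link partner ability (0x05) registers.
ABILITY_10HALF = 0x0020
ABILITY_10FULL = 0x0040
ABILITY_100HALF = 0x0080
ABILITY_100FULL = 0x0100


def resolve_link(advertise, lpa):
    """Return (speed_mbps, duplex) for the best common mode, or None.

    Priority, highest first: 100 full, 100 half, 10 full, 10 half.
    """
    common = advertise & lpa
    if common & ABILITY_100FULL:
        return (100, "full")
    if common & ABILITY_100HALF:
        return (100, "half")
    if common & ABILITY_10FULL:
        return (10, "full")
    if common & ABILITY_10HALF:
        return (10, "half")
    return None  # no common mode: speed and duplex stay unknown
```

If these registers are never read, or are read before negotiation finishes, a tool has nothing to report but "Unknown".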

We could try to write a proper TLK100 PHY driver so that we can configure PHY specific settings if needed.

Is your board something I can buy off-the-shelf? If so, can you provide me with a link to the shop/model?

Thanks!
Comment 13 Yaroslav 2013-03-06 13:31:15 UTC
(In reply to comment #12)
> Is your board something I can buy off-the-shelf? If so, can you provide me with
> a link to the shop/model?
> 
> Thanks!

No. My motherboard was designed and made in our company for our embedded systems. It has been sold in our products over the last couple of years (limited production; I think we have sold 100-200 units). I have used kernels 2.6.30-2.6.37.2 without problems all this time.

Recently, I wanted to test a new Linux Buildroot build with the new kernel and got this error.

I do not exclude the possibility that our engineers made a mistake in the design of the motherboard, but it is strange that I found it only two days ago.

I will follow your advice: I will add code to the r6040 driver that reads the PHY registers and prints them on the screen.
Comment 14 Florian Fainelli 2013-03-06 16:37:41 UTC
By adding some prints to r6040_phy_read() you should see whether all bits are properly sampled by the hardware, and you can check whether the various register values are consistent with the reset values, etc.

Thanks!
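
A minimal sketch of the kind of sanity check one could run on the printed register values. It assumes the values have already been collected into a dictionary of register address to 16-bit value; the function name and the rules are illustrative heuristics, not anything the driver does. The 0xffff rule relies on MDIO being a pulled-up line, so a read that no PHY answers typically comes back as all ones.

```python
BMCR_PDOWN = 0x0800    # standard MII power-down bit
BMCR_ISOLATE = 0x0400  # standard MII isolate bit


def check_phy_dump(regs):
    """Return a list of warnings for a {register_address: value} dump."""
    warnings = []
    values = list(regs.values())
    if any(not 0 <= v <= 0xFFFF for v in values):
        warnings.append("value outside the 16-bit range: check the print format")
    if values and all(v == 0xFFFF for v in values):
        warnings.append("every register reads 0xffff: possibly no PHY answering")
        return warnings
    if values and all(v == 0x0000 for v in values):
        warnings.append("every register reads 0x0000: MDIO line possibly stuck low")
        return warnings
    bmcr = regs.get(0x00)  # basic mode control register
    if bmcr is not None:
        if bmcr & BMCR_PDOWN:
            warnings.append("BMCR power-down bit is set")
        if bmcr & BMCR_ISOLATE:
            warnings.append("BMCR isolate bit is set")
    return warnings
```

An empty list only means none of these crude rules fired; the values still need to be compared against the reset values in the PHY datasheet.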
Comment 15 Yaroslav 2013-03-11 15:28:28 UTC
Created attachment 95171
debug driver source (3.6.8 kernel) & output result
Comment 16 Florian Fainelli 2013-03-11 16:18:23 UTC
There is definitely something wrong with the PHY register output. Can you make sure this is not a hardware issue (missing pull-ups/pull-downs)? I do not think that we can change the r6040 MDIO clock, but I will double-check against the datasheet.
Comment 17 Florian Fainelli 2013-03-20 09:13:44 UTC
Yaroslav, have you been able to do some hardware analysis on your device? Do you think there is still something to fix in the r6040 driver at this point?
Comment 18 Yaroslav 2013-03-20 12:02:09 UTC
I talked to our engineer. He said that from a hardware point of view everything is correct (all resistors, pull-ups/pull-downs).
He took the tlk100 wiring diagram with its support components (resistors, capacitors, quartz crystal, etc.) from TI's online examples.
I think it may be worth running the tests in the "redboot" loader. I want to check how the chip switches between 10/100 and half/full duplex, and see the state of the registers without the kernel.
Also, I am on leave from 18 March (24 days) and do not have access to the equipment.

Sorry for my bad english :)
Comment 19 Florian Fainelli 2013-03-20 12:58:34 UTC
No problem, we'll resume fixing this once you are back.
Comment 20 jose1711 2017-07-13 12:26:38 UTC
Possibly the same issue as in https://bugzilla.kernel.org/show_bug.cgi?id=196347. Since you never heard back from the reporter, please consider closing.
