Bug 39252 - [r8169] PPPoE connections don't work if a custom MAC address is assigned
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 high
Assignee: Francois Romieu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-07-12 22:51 UTC by Artem S. Tashkinov
Modified: 2011-08-08 15:04 UTC
CC List: 4 users

See Also:
Kernel Version: 3.0
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg (45.84 KB, text/plain)
2011-07-14 16:30 UTC, Artem S. Tashkinov
Pending r8169 changes from davem-next.r8169 (20.84 KB, patch)
2011-07-14 23:12 UTC, Francois Romieu
Fix sticky accepts packet bits in RxConfig (1.63 KB, patch)
2011-07-19 15:46 UTC, Francois Romieu
MAC address change fix for the 8168e-vl (1.89 KB, patch)
2011-08-02 13:56 UTC, Francois Romieu

Description Artem S. Tashkinov 2011-07-12 22:51:11 UTC
Description of problem: if I assign a custom MAC address to my onboard NIC, I cannot establish PPPoE connections, and even the `pppoe -A` command does not find any PPPoE access concentrators.


Version-Release number of selected component (if applicable): r8169 2.3LK-NAPI


How reproducible: always


Steps to Reproduce:
1. Boot
2. ifconfig eth0 hw ether MACADDRESS
3. Try to establish PPPoE connection using eth0
  
Actual results: the PPPoE connection cannot be established; no reply packets are received


Expected results: PPPoE connections working


Additional info: if I put eth0 into promiscuous mode, PPPoE starts working.

Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
        Subsystem: ASUSTeK Computer Inc. Device 8432
        Flags: bus master, fast devsel, latency 0, IRQ 47
        I/O ports at e000 [size=256]
        Memory at d0004000 (64-bit, prefetchable) [size=4K]
        Memory at d0000000 (64-bit, prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 01
        Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
        Capabilities: [d0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel
        Capabilities: [160] Device Serial Number 00-00-00-00-00-00-00-00
        Kernel driver in use: r8169
        Kernel modules: r8169
Comment 1 Artem S. Tashkinov 2011-07-12 22:58:35 UTC
After a PPPoE session has been established, running `ifconfig eth0 -promisc` (disabling a workaround for this patch) results in a PPPoE connection failure.
Comment 2 Artem S. Tashkinov 2011-07-12 23:06:12 UTC
(In reply to comment #1)
> After a PPPoE session has been established, running `ifconfig eth0 -promisc`
> (disabling a workaround for this patch) results in a PPPoE connection
> failure.

s/patch/bug/
Comment 3 Artem S. Tashkinov 2011-07-13 22:05:18 UTC
Is there any information I can dump/collect in order that this bug gets fixed ASAP?
Comment 4 Andrew Morton 2011-07-13 23:13:52 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 12 Jul 2011 22:51:16 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=39252
> 
>            Summary: [r8169] PPPoE connections don't work if a custom MAC
>                     address is assigned
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 3.0
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Network
>         AssignedTo: drivers_network@kernel-bugs.osdl.org
>         ReportedBy: t.artem@mailcity.com
>         Regression: No
> 
> 
> Description of problem: if I assign a custom MAC address to my onboard NIC,
> then I cannot establish PPPoE connections, and even `pppoe -A` command
> doesn't
> return any PPPoE access concentrators.
> 
> 
> Version-Release number of selected component (if applicable): r8169
> 2.3LK-NAPI
> 
> 
> How reproducible: always
> 
> 
> Steps to Reproduce:
> 1. Boot
> 2. ifconfig eth0 hw ether MACADDRESS
> 3. Try to establish PPPoE connection using eth0
> 
> Actual results: PPPoE connection cannot be established, no network packets
> return
> 
> 
> Expected results: PPPoE connections working
> 
> 
> Additional info: if I put eth0 in the promiscuous mode, then PPPoE starts
> working.
> 
> Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI
> Express
> Gigabit Ethernet controller (rev 06)
>         Subsystem: ASUSTeK Computer Inc. Device 8432
>         Flags: bus master, fast devsel, latency 0, IRQ 47
>         I/O ports at e000 [size=256]
>         Memory at d0004000 (64-bit, prefetchable) [size=4K]
>         Memory at d0000000 (64-bit, prefetchable) [size=16K]
>         Capabilities: [40] Power Management version 3
>         Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
>         Capabilities: [70] Express Endpoint, MSI 01
>         Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
>         Capabilities: [d0] Vital Product Data
>         Capabilities: [100] Advanced Error Reporting
>         Capabilities: [140] Virtual Channel
>         Capabilities: [160] Device Serial Number 00-00-00-00-00-00-00-00
>         Kernel driver in use: r8169
>         Kernel modules: r8169
>
Comment 5 Francois Romieu 2011-07-14 07:40:17 UTC
(In reply to comment #3)
> Is there any information I can dump/collect in order that this bug gets fixed
> ASAP?

1. linux kernel version
2. complete dmesg from boot including the XID line from the r8169 driver
   -> it helps identify the chipset. Depending on the number of 816x
      chipsets and kernel version, it could highlight a device detection
      problem (now fixed in -rc)
3. ethtool -d eth0
   -> supposedly the same information as the XID
4. motherboard model
   -> your report so far suggests a non-regression, new Asus motherboard issue.
      If this is a new, poorly supported device (8168e or 8168f something ?)
      issue, we may see more of those in a short timeframe.
5. please see patches at :
   http://git.kernel.org/?p=linux/kernel/git/romieu/netdev-2.6.git;a=shortlog;h=refs/heads/davem-next.r8169

-- 
Ueimor
Comment 6 Artem S. Tashkinov 2011-07-14 16:30:46 UTC
Created attachment 65662 [details]
dmesg

(In reply to comment #5)
> (In reply to comment #3)
> > Is there any information I can dump/collect in order that this bug gets
> fixed
> > ASAP?
> 
> 1. linux kernel version

Linux 3.0-rc7 vanilla

> 2. complete dmesg from boot including the XID line from the r8169 driver
>    -> it helps identifying the chipset. Depending on the number of 816x
>       chipsets and kernel version, it could highlight a device detection
>       problem (now fixed in -rc)

Here it is.

> 3. ethtool -d eth0
>    -> supposedly the same information as the XID

# ethtool -d eth0
Unknown RealTek chip (mask: 0x2c800000)

(dmesg says XID 0c900800)

> 4. motherboard model
>    -> your report so far suggests a non-regression, new Asus motherboard
>    issue.
>       If this is a new, poorly supported device (8168e or 8168f something ?)
>       issue, we may see more of those in a short timeframe.

ASUS P8H61-M LE

> 5. please see patches at :
>   
>
> http://git.kernel.org/?p=linux/kernel/git/romieu/netdev-2.6.git;a=shortlog;h=refs/heads/davem-next.r8169

Can these patches be applied safely on top of 3.0-rc7 without jumping onto -next bandwagon?

> 
> -- 
> Ueimor
Comment 7 Francois Romieu 2011-07-14 23:10:05 UTC
(In reply to comment #6)
[...]
> # ethtool -d eth0
> Unknown RealTek chip (mask: 0x2c800000)
> 
> (dmesg says XID 0c900800)

This is a RTL8168evl. You need the patches in davem-next.r8169.

[...]
> >
> http://git.kernel.org/?p=linux/kernel/git/romieu/netdev-2.6.git;a=shortlog;h=refs/heads/davem-next.r8169
> 
> Can these patches be applied safely on top of 3.0-rc7 without jumping onto
> -next bandwagon?

Yes, the patches will apply to current -rc.

-- 
Ueimor
Comment 8 Francois Romieu 2011-07-14 23:12:00 UTC
Created attachment 65682 [details]
Pending r8169 changes from davem-next.r8169

The patch supports rtl8168evl.
Comment 9 Artem S. Tashkinov 2011-07-15 05:19:00 UTC
(In reply to comment #8)
> Created an attachment (id=65682) [details]
> Pending r8169 changes from davem-next.r8169
> 
> The patch supports rtl8168evl.

I've applied these patches but they didn't solve the bug. The issue persists.
Comment 10 David S. Miller 2011-07-18 18:50:32 UTC
From: Andrew Morton <akpm@linux-foundation.org>
Date: Wed, 13 Jul 2011 16:13:45 -0700

>> https://bugzilla.kernel.org/show_bug.cgi?id=39252
>> 
>>            Summary: [r8169] PPPoE connections don't work if a custom MAC
>>                     address is assigned
 ...
>> Description of problem: if I assign a custom MAC address to my onboard NIC,
>> then I cannot establish PPPoE connections, and even `pppoe -A` command
>> doesn't
>> return any PPPoE access concentrators.

Since you seem to be creating your PPPoE connections _after_ changing
the MAC, the following shouldn't matter, but for the cases where
PPPoE connections already exist we do need this kind of change.

Again, I don't expect this to fix the bug, and I believe that it's
some r8169 specific issue.  Although, it might.

--------------------
pppoe: Must flush connections when MAC address changes too.

Kernel bugzilla: 39252

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/pppoe.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 718879b..bc9a4bb 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -348,8 +348,9 @@ static int pppoe_device_event(struct notifier_block *this,
 
 	/* Only look at sockets that are using this specific device. */
 	switch (event) {
+	case NETDEV_CHANGEADDR:
 	case NETDEV_CHANGEMTU:
-		/* A change in mtu is a bad thing, requiring
+		/* A change in mtu or address is a bad thing, requiring
 		 * LCP re-negotiation.
 		 */
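
For readers who do not have drivers/net/pppoe.c at hand, the effect of the hunk above can be sketched in user space. This is only an illustration, not the kernel code: the `dev_event` enum and `pppoe_event_needs_flush()` are hypothetical names, the event values are not the kernel's NETDEV_* constants, and the assumption that the MTU case falls through to the same flush path as the "device going down" cases is inferred from the patch title rather than shown in the truncated diff.

```c
#include <stdbool.h>

/* Hypothetical event codes for illustration only; these are NOT the
 * kernel's NETDEV_* values. */
enum dev_event {
    EV_UP,
    EV_DOWN,
    EV_GOING_DOWN,
    EV_CHANGEMTU,
    EV_CHANGEADDR
};

/* Return true if existing PPPoE sessions bound to the device should be
 * flushed when `event` is delivered (assumed behaviour after the patch). */
bool pppoe_event_needs_flush(enum dev_event event)
{
    switch (event) {
    case EV_CHANGEADDR:   /* new with the patch: address changes flush too */
    case EV_CHANGEMTU:    /* MTU change requires LCP re-negotiation */
    case EV_GOING_DOWN:   /* assumption: shares the flush path */
    case EV_DOWN:
        return true;
    default:
        return false;
    }
}
```

With this model, `pppoe_event_needs_flush(EV_CHANGEADDR)` is true only because of the added case label; everything else is unchanged.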
Comment 11 Artem S. Tashkinov 2011-07-18 19:14:11 UTC
# lsmod | grep pppoe
<empty>

I'm not using pppoe kernel module, so your patch won't help me.
Comment 12 Francois Romieu 2011-07-18 21:11:44 UTC
(In reply to comment #9)
[...]
> I've applied these patches but they didn't solve the bug. The issue persists.

Does 'ethtool -S eth0' say anything special ? All zeroes ?

Realtek's firmware may deserve a try. See:

http://git.kernel.org/?p=linux/kernel/git/dwmw2/linux-firmware.git;a=commit;h=7293f35d80afc5c614c4d4c9758907e46a0f0c0e

(ethtool -i eth0 will tell you if it is loaded or not).

Otherwise you can try Realtek's own r8168 driver.

I'll try to reproduce the problem here anyway.

-- 
Ueimor
Comment 13 Artem S. Tashkinov 2011-07-18 21:34:06 UTC
(In reply to comment #12)
> (In reply to comment #9)
> [...]
> > I've applied these patches but they didn't solve the bug. The issue
> persists.
> 
> Does 'ethtool -S eth0' say anything special ? All zeroes ?

# ethtool -S eth0
NIC statistics:
     tx_packets: 1305178
     rx_packets: 2084928
     tx_errors: 0
     rx_errors: 0
     rx_missed: 0
     align_errors: 0
     tx_single_collisions: 0
     tx_multi_collisions: 0
     unicast: 1994006
     broadcast: 14955
     multicast: 90922
     tx_aborted: 0
     tx_underrun: 0

> 
> Realtek's firmware may deserve a try. See:
> 
>
> http://git.kernel.org/?p=linux/kernel/git/dwmw2/linux-firmware.git;a=commit;h=7293f35d80afc5c614c4d4c9758907e46a0f0c0e
> 
> (ethtool -i eth0 will tell you if it is loaded or not).
> 

I'm already using it (though there's zero difference):

# ethtool -i eth0
driver: r8169
version: 2.3LK-NAPI
firmware-version: rtl_nic/rtl8168e-3.fw
bus-info: 0000:04:00.0

> Otherwise you can try Realtek's own r8168 driver.
> 

Realtek's own driver fails in exactly the same way ;) And after using the r8168 driver, the native Linux r8169 driver no longer works (it loads, but no data enters or leaves the NIC).

> I'll try to reproduce the problem here anyway.

Thank you.
Comment 14 Francois Romieu 2011-07-19 10:32:35 UTC
(In reply to comment #13)
[...]
> I'll try to reproduce the problem here anyway.

It was the easy part.

The NIC (8168evl) sends frames with the new mac address but it still receives
through the old mac address.

-- 
Ueimor
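
The symptom described in this comment can be pictured with a small model of a unicast receive filter. It is a sketch under stated assumptions, not r8169 code: `struct rx_filter` and `rx_filter_accepts()` are hypothetical names, and the model only assumes that the hardware compares the destination address against whatever address is programmed into its filter. If that address is still the old MAC, unicast replies sent to the new MAC (for example the PPPoE PADO answer to a broadcast PADI) are dropped unless promiscuous mode is on, which matches the workaround from the original report.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical model of a NIC receive filter (not the driver's structures). */
struct rx_filter {
    uint8_t hw_addr[ETH_ALEN]; /* address the hardware filter matches on */
    bool promisc;              /* accept every frame regardless of address */
};

/* Return true if a frame addressed to `dest` would reach the host. */
bool rx_filter_accepts(const struct rx_filter *f, const uint8_t dest[ETH_ALEN])
{
    static const uint8_t bcast[ETH_ALEN] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};

    if (f->promisc)
        return true;                      /* promiscuous: no address check */
    if (memcmp(dest, bcast, ETH_ALEN) == 0)
        return true;                      /* broadcast is always accepted here */
    return memcmp(dest, f->hw_addr, ETH_ALEN) == 0; /* unicast must match */
}
```

In this model, a filter still holding the old address rejects a frame sent to the new address until `promisc` is set.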
Comment 15 Francois Romieu 2011-07-19 13:30:40 UTC
(In reply to comment #14)
[...]
> The NIC (8168evl) sends frames with the new mac address but it still receives
> through the old mac address.

And once promiscuous mode has been enabled, it cannot be disabled again,
even though the kernel claims otherwise: the chipset does not care and
keeps receiving everything, including frames for a non-matching hw address.

-- 
Ueimor
Comment 16 Francois Romieu 2011-07-19 15:44:08 UTC
(In reply to comment #15)
[...]
> And as soon as the promiscuous mode has been enabled, it won't be disabled
> even when tried to and the kernel claims otherwise : the chipset does not
> care and it now receives everything - even non-matching hw address.

Gaahh...

e542a2269f232d61270ceddd42b73a4348dee2bb never resets the accept packets
bit in RxConfig. It explains the sticky behavior of the promiscuous mode.

The attached patch on top of davem-next fixes the bug your workaround
relies on: if you apply it, you will need to keep the interface in
promiscuous mode for PPPoE to work (at least until your problem is solved).

I wonder if your problem is related to the ASF capabilities of this NIC.
Assuming nobody beats me, I'll see on thursday if the mac address can be
forced through the eeprom (assuming there is one :o/ ). There is some code
for it in Realtek's own r8168 driver.

-- 
Ueimor
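
The "sticky" behaviour described in this comment can be illustrated with plain bit arithmetic. This is a sketch, not the driver's code: the function names are hypothetical, the bit values are assumptions chosen for the example, and the "sticky" variant only models one way accept bits could fail to be reset (OR-ing the new mode into the old register value). The fixed variant clears the whole accept field before applying the requested mode.

```c
#include <stdint.h>

/* Illustrative accept bits in a receive-config register. The values are
 * assumptions for this example, not taken from the thread. */
enum {
    ACCEPT_ALL_PHYS  = 0x01, /* promiscuous: accept any destination */
    ACCEPT_MY_PHYS   = 0x02, /* accept frames for our own address */
    ACCEPT_MULTICAST = 0x04,
    ACCEPT_BROADCAST = 0x08,
    ACCEPT_MASK      = 0x0f  /* all accept bits together */
};

/* Sticky update: the new mode is OR-ed in, so a bit that was set once
 * (e.g. ACCEPT_ALL_PHYS) is never cleared again. */
uint32_t rx_mode_sticky(uint32_t rx_config, uint32_t mode)
{
    return rx_config | mode;
}

/* Fixed update: clear every accept bit first, then apply the requested
 * mode, leaving the unrelated configuration bits untouched. */
uint32_t rx_mode_fixed(uint32_t rx_config, uint32_t mode)
{
    return (rx_config & ~(uint32_t)ACCEPT_MASK) | (mode & ACCEPT_MASK);
}
```

Calling `rx_mode_sticky()` with a promiscuous mode and then with a normal mode leaves `ACCEPT_ALL_PHYS` set, whereas `rx_mode_fixed()` drops it while preserving the other register bits.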
Comment 17 Francois Romieu 2011-07-19 15:46:00 UTC
Created attachment 66112 [details]
Fix sticky accepts packet bits in RxConfig
Comment 18 Artem S. Tashkinov 2011-07-19 19:30:23 UTC
(In reply to comment #16)

I'm sorry but I no longer have this motherboard/NIC so whatever fix you deem proper for solving this bug will be it - I won't be able to test your patches :(

I'm not even sure if we have to leave this bug report open. ;)

So, do what you think is appropriate in this situation.
Comment 19 Francois Romieu 2011-08-02 13:56:50 UTC
Created attachment 67352 [details]
MAC address change fix for the 8168e-vl

Tested on top of Linus's current -git tree with my external 8168e-vl NIC.
Comment 20 Florian Mickler 2011-08-08 08:22:17 UTC
A patch referencing this bug report has been merged in Linux v3.1-rc1:

commit c28aa38567101bad4e020f4392df41d0bf6c165c
Author: françois romieu <romieu@fr.zoreil.com>
Date:   Tue Aug 2 03:53:43 2011 +0000

    r8169 : MAC address change fix for the 8168e-vl.
Comment 21 Florian Mickler 2011-08-08 15:04:49 UTC
A patch referencing this bug report has been merged in Linux v3.0:

commit 680ba7ca630f5816af9c80a946520be76b2167a5
Author: David S. Miller <davem@davemloft.net>
Date:   Mon Jul 18 11:48:28 2011 -0700

    pppoe: Must flush connections when MAC address changes too.
