Bug 9606 - sky2: driver-specific VLAN support is broken with "Yukon-EC (0xb6) rev 1"
Status: CLOSED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Duplicates: 10693
Depends on:
Blocks:
 
Reported: 2007-12-19 07:36 UTC by Dmitry Butskoy
Modified: 2012-05-17 15:23 UTC
CC List: 5 users

See Also:
Kernel Version: 2.6.23.8
Subsystem:
Regression: No
Bisected commit-id:


Attachments
HW vlan restore on reset (1.49 KB, patch)
2008-05-14 11:05 UTC, Stephen Hemminger

Description Dmitry Butskoy 2007-12-19 07:36:10 UTC
Distribution:  Fedora 7

Hardware Environment: Builtin network card

Problem Description:

Our "sky2" device is:
> sky2 0000:04:00.0: v1.18 addr 0xff720000 irq 17 Yukon-EC (0xb6) rev 1

It works fine without any VLAN configured on it, but does not seem to work with VLANs.

The corresponding "dmesg" fragment is:

> sky2 eth0: rx length error: status 0x1242100 length 292
> printk: 311 messages suppressed.
> sky2 eth0: rx length error: status 0x27a2300 length 634
> printk: 299 messages suppressed.
> sky2 eth0: rx length error: status 0x402100 length 64
> printk: 264 messages suppressed.
> sky2 eth0: rx length error: status 0xba2100 length 186
> printk: 212 messages suppressed.
> sky2 eth0: rx length error: status 0x442500 length 68

This output is generated by sky2_receive() in drivers/net/sky2.c. By inserting printk() calls into the code and recompiling, I found that in our case "count" is equal to "length", so decreasing it by VLAN_HLEN seems superfluous. But commenting out the decrement does not help, because the VLAN bytes are not actually stripped from the frame. Moreover, sky2->rx_tag is always zero...
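
For illustration, the status words in the dmesg fragment above can be decoded by hand. The following is a standalone userspace sketch, not driver code. It assumes that the frame length counted by the MAC sits in the upper 16 bits of the status word and that bit 13 marks a VLAN frame (my reading of the GMR_FS_* definitions in sky2.h, to be verified against the actual tree). All function names are made up for this example.

```c
#include <stdint.h>
#include <stdio.h>

/* ASSUMPTION: bit layout of the receive status word, as I read sky2.h. */
#define RX_STATUS_LEN_SHIFT 16          /* frame length in the upper 16 bits */
#define RX_STATUS_LEN_MASK  0xffffu
#define RX_STATUS_VLAN      (1u << 13)  /* MAC saw a VLAN-tagged frame */
#define VLAN_HLEN           4           /* size of an 802.1Q tag in bytes */

/* Frame length as counted by the MAC, taken from the status word. */
unsigned status_count(uint32_t status)
{
	return (status >> RX_STATUS_LEN_SHIFT) & RX_STATUS_LEN_MASK;
}

/* Non-zero if the status word flags the frame as VLAN-tagged. */
int status_is_vlan(uint32_t status)
{
	return (status & RX_STATUS_VLAN) != 0;
}

/* Length the check expects if it assumes the tag was stripped by hardware. */
unsigned count_after_vlan_adjust(uint32_t status)
{
	unsigned count = status_count(status);

	if (status_is_vlan(status))
		count -= VLAN_HLEN;
	return count;
}

/* Non-zero if the adjusted count disagrees with the length reported by DMA. */
int length_check_fails(uint32_t status, unsigned length)
{
	return count_after_vlan_adjust(status) != length;
}

/* Print the decoded fields for one (status, length) pair from the log. */
void decode(uint32_t status, unsigned length)
{
	printf("status 0x%x: count=%u vlan=%d length=%u adjusted=%u\n",
	       (unsigned)status, status_count(status), status_is_vlan(status),
	       length, count_after_vlan_adjust(status));
}
```

With the first logged pair (status 0x1242100, length 292) this gives a count of 292 with the VLAN flag set, and 288 after the VLAN_HLEN adjustment. That is consistent with the observation that "count" already equals "length" before the decrement.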

Our workaround is to recompile the sky2 driver with the SKY2_VLAN_TAG_USED macro undefined.

It is possible, though, that we have hit a corner case. First, we use several VLAN interfaces on top of our sky2 device. Second, we change the MAC address of the sky2 device (while it is down) and of its VLANs.

I can perform any additional testing etc. if needed.
Comment 1 Stephen Hemminger 2007-12-27 13:01:43 UTC
Did you build the kernel with VLAN support?

Have you configured any VLANs on the device?
Comment 2 Dmitry Butskoy 2007-12-28 07:16:35 UTC
Oh, these are very important questions... :)

Yes, I use the kernel from Fedora 7, which is built with VLAN support.

Yes, I've configured vlans on the device.


The actual commands that affect the device (according to my own scripts) are as follows (say the device is "eth0"):


ip link set eth2 name eth0

ip link set up dev eth0
vconfig add eth0 2
ip link set down dev eth0

ip link set up dev eth0
vconfig add eth0 10
ip link set down dev eth0

ip link set up dev eth0
vconfig add eth0 13
ip link set down dev eth0

ip link set address 00:A0:C9:9E:74:05 dev eth0
ip addr add 192.168.0.180/32 dev eth0

ip link set address 00:A0:C9:9E:74:05 dev eth0.2
ip addr add 192.168.0.180/32 dev eth0.2
ip addr add 10.64.0.180/22 broadcast 10.64.3.255 dev eth0.2

ip link set address 00:A0:C9:9E:74:05 dev eth0.10
ip addr add 192.168.0.180/32 dev eth0.10

ip link set address 00:A0:C9:9E:74:05 dev eth0.13
ip addr add 192.168.0.180/32 dev eth0.13

ip link set up dev eth0
ip link set up dev eth0.2
ip link set up dev eth0.10
ip link set up dev eth0.13


Then some routes are added, etc.
Comment 3 Stephen Hemminger 2008-01-25 17:15:18 UTC
The problem is related to setting the mac address. Note: you don't have
to set the hardware address of the vlan devices. They inherit from the parent.

Setting the address gets the device confused about hardware VLAN tag
group and packets are not being stripped of tag, but should be.  Not sure
if it is a sky2 specific or general hw vlan support problem yet.
Comment 4 Dmitry Butskoy 2008-01-28 08:03:24 UTC
> you don't have to set the hardware address of the vlan devices.
> They inherit from the parent.

I know.

But for the same reasons that the MAC address is changed on normal devices, users might want to change it on VLANs too.

It works on non-hardware VLANs already.


My "corner case" is: I want to move a "router" MAC address between several Linux routers (to provide a kind of router clustering, etc.). It has worked fine for years while I used software VLANs only, but with hardware VLANs it seems problematic...

When I change the MAC of the underlying (base) device, the kernel preserves the old MAC addresses of all VLANs previously configured on that device. Hence we then have to change the MACs of the VLANs individually. This seems to be by design.

Could an option be implemented to sync a MAC change to all VLANs "at once"?
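
Until such an option exists, the per-device changes can at least be grouped in userspace. The following is a minimal sketch under stated assumptions, not a kernel feature: it uses the standard Linux SIOCSIFHWADDR ioctl, needs CAP_NET_ADMIN, and the function names (parse_mac, set_mac, sync_mac) are hypothetical. Whether a given driver requires the interface to be down first is not checked here.

```c
#define _DEFAULT_SOURCE
#include <net/if.h>
#include <net/if_arp.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Parse "aa:bb:cc:dd:ee:ff" into 6 bytes; returns 0 on success, -1 on error. */
int parse_mac(const char *text, unsigned char mac[6])
{
	unsigned int b[6];
	char extra;
	int i;

	/* The trailing %c only matches if there is junk after the sixth byte. */
	if (sscanf(text, "%2x:%2x:%2x:%2x:%2x:%2x%c",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &extra) != 6)
		return -1;
	for (i = 0; i < 6; i++)
		mac[i] = (unsigned char)b[i];
	return 0;
}

/* Set the hardware address of one interface through an open socket. */
int set_mac(int fd, const char *ifname, const unsigned char mac[6])
{
	struct ifreq ifr;

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
	ifr.ifr_hwaddr.sa_family = ARPHRD_ETHER;
	memcpy(ifr.ifr_hwaddr.sa_data, mac, 6);
	return ioctl(fd, SIOCSIFHWADDR, &ifr);
}

/*
 * Apply one MAC to the base device and to each VLAN device in turn.
 * Returns the number of interfaces that failed, or -1 on setup errors.
 */
int sync_mac(const char *mac_text, const char *const ifnames[], int n)
{
	unsigned char mac[6];
	int fd, i, failures = 0;

	if (parse_mac(mac_text, mac) != 0)
		return -1;
	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;
	for (i = 0; i < n; i++) {
		if (set_mac(fd, ifnames[i], mac) != 0) {
			perror(ifnames[i]);
			failures++;
		}
	}
	close(fd);
	return failures;
}
```

For the setup from comment 2, sync_mac() would be called with "00:A0:C9:9E:74:05" and the names eth0, eth0.2, eth0.10 and eth0.13. This only replaces the four separate "ip link set address" calls; it does not change how the kernel treats VLAN addresses.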
Comment 5 Stephen Hemminger 2008-05-14 11:05:24 UTC
Created attachment 16144
HW vlan restore on reset

Patch to restore vlan tagging on reset
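
The attachment itself is not reproduced here. As a rough illustration of the idea in its title only, the toy model below shows how a reset that clears hardware state can leave the driver's software view (a VLAN group is registered, so tags are assumed stripped) out of step with the hardware, and how re-applying the setting after the reset brings them back in line. Every name in it is hypothetical and none of it is taken from sky2.c.

```c
#include <stdbool.h>

/* Hypothetical model of one port; field names are illustrative only. */
struct toy_port {
	bool vlan_group_registered; /* software state: a VLAN was added */
	bool hw_strip_enabled;      /* hardware state: RX tag stripping on */
};

/* Registering a VLAN sets both the software and the hardware state. */
void toy_vlan_register(struct toy_port *p)
{
	p->vlan_group_registered = true;
	p->hw_strip_enabled = true;
}

/* A plain reset clears hardware state but leaves software state alone. */
void toy_reset(struct toy_port *p)
{
	p->hw_strip_enabled = false;
}

/* Reset, then re-apply the hardware setting from the software state. */
void toy_reset_with_restore(struct toy_port *p)
{
	toy_reset(p);
	p->hw_strip_enabled = p->vlan_group_registered;
}

/* True when software assumes stripped tags but hardware does not strip. */
bool toy_is_inconsistent(const struct toy_port *p)
{
	return p->vlan_group_registered && !p->hw_strip_enabled;
}
```

In this model, toy_reset() after toy_vlan_register() leaves the port inconsistent, which corresponds to the symptoms in the description (tags not stripped, length check failing), while toy_reset_with_restore() does not.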
Comment 6 Dmitry Butskoy 2008-08-12 09:22:10 UTC
Sorry for the long delay...

The patch fixes the reported issues.

Unfortunately, another issue still exists. Any application that writes to the network (i.e. sends output through this network card) fails after a while. IOW, after some amount of successful writing (on both TCP and UDP sockets, tested with "nc"), a subsequent write attempt fails.

There are no such symptoms when I recompile the sky2 module to use software VLANs.

It seems that this issue is triggered by even a single attempt to change the MAC (just after the driver is loaded)...
Comment 7 Stephen Hemminger 2008-09-16 11:46:40 UTC
*** Bug 10693 has been marked as a duplicate of this bug. ***
Comment 8 Trenton D. Adams 2009-03-10 01:48:25 UTC
I'm seeing this problem in Gentoo kernel 2.6.28. I ran a gunzip -c command on a 30G tar.gz file and piped it over the network. It completely locked up my computer twice after running for a very long time. I was extracting the tar.gz to a VMware machine that was running at the time.

I hope the info below helps.

I'm running on a Macbook second generation "2,1".
tdamac backup # lspci
00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub (rev 03)
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)
00:07.0 Performance counters: Intel Corporation Device 27a3 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 02)
01:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 22)
02:00.0 Network controller: Atheros Communications Inc. AR5418 802.11abgn Wireless PCI Express Adapter (rev 01)
03:03.0 FireWire (IEEE 1394): Agere Systems FW323 (rev 61)

I got some of these.

Mar 10 02:21:53 tdamac sky2 eth1: hung mac 4:59 fifo 0 (197:197)
Mar 10 02:21:53 tdamac sky2 eth1: receiver hang detected
Mar 10 02:21:53 tdamac sky2 eth1: disabling interface
Mar 10 02:21:53 tdamac Attempt to release TCP socket in state 1 ffff880078c476c0
Mar 10 02:21:53 tdamac ------------[ cut here ]------------
Mar 10 02:21:53 tdamac WARNING: at net/core/dst.c:266 dst_release+0x23/0x2c()
Mar 10 02:21:53 tdamac Modules linked in: vmnet vmblock vmci vmmon i915 drm snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device ipt_MASQUERADE iptable_nat nf_nat pl2303 usbserial hermes fuse cifs snd_hda_intel snd_pcm snd_timer snd snd_page_alloc appletouch pcspkr applesmc hwmon led_class input_polldev scsi_wait_scan
Mar 10 02:21:53 tdamac Pid: 9, comm: events/0 Not tainted 2.6.28-gentoo-r2 #10
Mar 10 02:21:53 tdamac Call Trace:
Mar 10 02:21:53 tdamac <IRQ>  [<ffffffff80232dc8>] warn_on_slowpath+0x51/0x6d
Mar 10 02:21:53 tdamac [<ffffffff80233993>] printk+0x4e/0x56
Mar 10 02:21:53 tdamac [<ffffffff80229902>] task_rq_lock+0x40/0x75
Mar 10 02:21:53 tdamac [<ffffffff80601827>] _spin_lock_bh+0x9/0x1f
Mar 10 02:21:53 tdamac [<ffffffff80550cda>] __nf_ct_ext_destroy+0x41/0x57
Mar 10 02:21:53 tdamac [<ffffffff8053a1bc>] dst_release+0x23/0x2c
Mar 10 02:21:53 tdamac [<ffffffff80530921>] skb_release_head_state+0xd/0x8f
Mar 10 02:21:53 tdamac [<ffffffff8053106c>] skb_release_all+0x9/0x12
Mar 10 02:21:53 tdamac [<ffffffff80530859>] __kfree_skb+0x9/0x6f
Mar 10 02:21:53 tdamac [<ffffffff8041653e>] sky2_tx_complete+0xdc/0x158
Mar 10 02:21:53 tdamac [<ffffffff804184be>] sky2_poll+0x8e6/0xb31
Mar 10 02:21:52 tdamac dhcpcd[6899]: eth1: carrier lost
Mar 10 02:21:53 tdamac [<ffffffff805381e2>] net_rx_action+0x9d/0x170
Mar 10 02:21:53 tdamac [<ffffffff8023737c>] __do_softirq+0x7a/0x13d
Mar 10 02:21:53 tdamac [<ffffffff8020c32c>] call_softirq+0x1c/0x28
Mar 10 02:21:53 tdamac <EOI>  [<ffffffff8020d198>] do_softirq+0x2c/0x68
Mar 10 02:21:53 tdamac [<ffffffff802377db>] local_bh_enable+0x81/0x92
Mar 10 02:21:53 tdamac [<ffffffff80416994>] sky2_down+0x3da/0x4ff
Mar 10 02:21:53 tdamac [<ffffffff80419216>] sky2_restart+0x27/0xda
Mar 10 02:21:53 tdamac [<ffffffff804191ef>] sky2_restart+0x0/0xda
Mar 10 02:21:53 tdamac [<ffffffff80241264>] run_workqueue+0x7a/0x102
Mar 10 02:21:53 tdamac [<ffffffff80241bbd>] worker_thread+0xd5/0xe0
Mar 10 02:21:53 tdamac [<ffffffff8024442f>] autoremove_wake_function+0x0/0x2e
Mar 10 02:21:53 tdamac [<ffffffff80241ae8>] worker_thread+0x0/0xe0
Mar 10 02:21:53 tdamac [<ffffffff8024431b>] kthread+0x47/0x74
Mar 10 02:21:53 tdamac [<ffffffff8022e2cd>] schedule_tail+0x27/0x5f
Mar 10 02:21:53 tdamac [<ffffffff8020bfc9>] child_rip+0xa/0x11
Mar 10 02:21:53 tdamac [<ffffffff802442d4>] kthread+0x0/0x74
Mar 10 02:21:53 tdamac [<ffffffff8020bfbf>] child_rip+0x0/0x11
Mar 10 02:21:53 tdamac ---[ end trace 121f082195d3bb93 ]---
Mar 10 02:21:53 tdamac ------------[ cut here ]------------

Then I got some of these...

Mar 10 02:21:53 tdamac ------------[ cut here ]------------
Mar 10 02:21:53 tdamac WARNING: at net/core/dst.c:266 dst_release+0x23/0x2c()
Mar 10 02:21:53 tdamac Modules linked in: vmnet vmblock vmci vmmon i915 drm snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device ipt_MASQUERADE iptable_nat nf_nat pl2303 usbserial hermes fuse cifs snd_hda_intel snd_pcm snd_timer snd snd_page_alloc appletouch pcspkr applesmc hwmon led_class input_polldev scsi_wait_scan
Mar 10 02:21:53 tdamac Pid: 9239, comm: vmware-vmx Tainted: G        W  2.6.28-gentoo-r2 #10
Mar 10 02:21:53 tdamac Call Trace:
Mar 10 02:21:53 tdamac [<ffffffff80232dc8>] warn_on_slowpath+0x51/0x6d
Mar 10 02:21:53 tdamac [<ffffffff80288ed0>] pipe_read+0x383/0x397
Mar 10 02:21:53 tdamac [<ffffffff80532520>] memcpy_toiovec+0x36/0x66
Mar 10 02:21:53 tdamac [<ffffffff8053a1bc>] dst_release+0x23/0x2c
Mar 10 02:21:53 tdamac [<ffffffff80530921>] skb_release_head_state+0xd/0x8f
Mar 10 02:21:53 tdamac [<ffffffff8053106c>] skb_release_all+0x9/0x12
Mar 10 02:21:53 tdamac [<ffffffff80530859>] __kfree_skb+0x9/0x6f
Mar 10 02:21:53 tdamac [<ffffffffa01fa4f7>] VNetUserIf_Create+0xcb3/0xe59 [vmnet]
Mar 10 02:21:53 tdamac [<ffffffff8022caf7>] default_wake_function+0x0/0xe
Mar 10 02:21:53 tdamac [<ffffffffa01f8871>] cleanup_module+0x261/0x265 [vmnet]
Mar 10 02:21:53 tdamac [<ffffffff8028308f>] vfs_read+0xaa/0x133
Mar 10 02:21:53 tdamac [<ffffffff80283376>] sys_read+0x45/0x6e
Mar 10 02:21:53 tdamac [<ffffffff8020b0db>] system_call_fastpath+0x16/0x1b
Mar 10 02:21:53 tdamac ---[ end trace 121f082195d3bb93 ]---
Mar 10 02:21:53 tdamac ------------[ cut here ]------------
Comment 9 Trenton D. Adams 2009-03-10 01:49:13 UTC
Oh, by the way, my vmware was running with the Ethernet configured in bridged mode.
