Created attachment 72997 [details]
config file, lspci, dmesg, version and more debug info.

The system hangs when an interface that uses the e1000 driver is configured to receive its IP address from DHCP. Message:

[ 242.556028] INFO: task kworker/1:3:622 blocked for more than 120 seconds.
[ 242.562825] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.570661] kworker/1:3 D ffffffff8180cb40 0 622 2 0x00000000
[ 242.570675] ffff8801a172dad0 0000000000000046 ffff8801a172dfd8 00000000000137c0
[ 242.570695] ffff8801a172c010 00000000000137c0 00000000000137c0 00000000000137c0
[ 242.570716] ffff8801a172dfd8 00000000000137c0 ffff8801b69116e0 ffff8801a16344a0
[ 242.570738] Call Trace:
[ 242.570755] [<ffffffff81669d39>] schedule+0x29/0x70
[ 242.570765] [<ffffffff81667f8d>] schedule_timeout+0x1fd/0x2e0
[ 242.570777] [<ffffffff8108d5ba>] ? update_curr+0x14a/0x1e0
[ 242.570788] [<ffffffff81669b8b>] wait_for_common+0xdb/0x180
[ 242.570799] [<ffffffff8108ecb8>] ? idle_balance+0xf8/0x150
[ 242.570809] [<ffffffff81086d90>] ? try_to_wake_up+0x2d0/0x2d0
[ 242.570819] [<ffffffff8166aacf>] ? _raw_spin_lock_irqsave+0x2f/0x40
[ 242.570829] [<ffffffff81669d0d>] wait_for_completion+0x1d/0x20
[ 242.570839] [<ffffffff8106fd51>] wait_on_work+0x1a1/0x1b0
[ 242.570849] [<ffffffff8106e0d0>] ? do_work_for_cpu+0x30/0x30
[ 242.570858] [<ffffffff8106fe7d>] __cancel_work_timer+0x4d/0x170
[ 242.570869] [<ffffffff810e1321>] ? synchronize_irq+0x51/0xf0
[ 242.570878] [<ffffffff8106ffd0>] cancel_work_sync+0x10/0x20
[ 242.570915] [<ffffffffa004fff5>] e1000_down_and_stop+0x25/0x50 [e1000]
[ 242.570933] [<ffffffffa005554f>] e1000_down+0x14f/0x230 [e1000]
[ 242.570954] [<ffffffffa0055b50>] ? e1000_change_mtu+0x1c0/0x1c0 [e1000]
[ 242.570972] [<ffffffffa0055bcd>] e1000_reset_task+0x7d/0xa0 [e1000]
[ 242.570983] [<ffffffff8106ecdb>] process_one_work+0x12b/0x470
[ 242.570993] [<ffffffff81071846>] worker_thread+0x176/0x420
[ 242.571002] [<ffffffff810716d0>] ? manage_workers+0x120/0x120
[ 242.571011] [<ffffffff8107639e>] kthread+0x9e/0xb0
[ 242.571023] [<ffffffff81674464>] kernel_thread_helper+0x4/0x10
[ 242.571033] [<ffffffff81076300>] ? kthread_freezable_should_stop+0x70/0x70
[ 242.571043] [<ffffffff81674460>] ? gs_change+0x13/0x13
[ 362.568027] INFO: task kworker/1:3:622 blocked for more than 120 seconds.

It looks like there is a deadlock in the e1000 driver. The hang happens when eth1, which uses the e1000 driver, is configured to receive its IP address dynamically from a DHCP server. No hangs happen when the interface works with a static IP. The same hardware is stable with the 3.0.0 kernel.

The same bug is reported in Debian Bug#665693: http://lists.debian.org/debian-kernel/2012/03/msg00811.html
Relevant discussion on LKML: https://lkml.org/lkml/2011/11/17/434

It looks like the patch from vanilla did NOT solve the problem:
https://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=3a3847e007aae732d64d8fd1374126393e9879a3;hp=1032c736e81cdf490ae62f86da7efe67c3c3e61d

I have tested this on Ubuntu's unmodified mainline kernels v3.3 and 3.4.0-rc3 (https://wiki.ubuntu.com/KernelMainlineBuilds). The same problem is found in Ubuntu's kernel 3.2.0. The last kernel that is working for me is 3.0.0. Last kernel t
I could not reproduce the same error on an old machine with a single-socket 32-bit P4 CPU.
I can confirm this problem. My Linux box has two network interfaces; eth1 uses the e1000 driver. On Ubuntu 11.10 with kernel 3.0.0 there are no problems. After the upgrade to Ubuntu 12.04 the e1000 device stops working and I get the same "task kworker" messages on the console. I tried the latest 3.4.0-rc4 kernel without Ubuntu patches, and the problem is still there. I have dhclient running on eth0 (RTL8168e/8111e) for my ISP cable connection. On eth1 (e1000) I have my private IPv4 range and an IPv6 range, with DHCP, DHCPv6 and radvd running. Once the 3.2 or 3.4-rc4 kernel has finished booting, the e1000 network device stops working. Sometimes during boot I see a message that the e1000 device has been reset. I have also attached lspci output, dmesg logs and my kernel config.
Created attachment 73121 [details] lspci, dmesg and kernel config
FYI, I have started looking into the details.
Has anybody reproduced this with the 3.3.4 kernel?
Today I compiled the 3.3.4 kernel without Ubuntu patches and the e1000 driver is still not working. Here is the dmesg:

[ 241.068051] INFO: task kworker/3:1:34 blocked for more than 120 seconds.
[ 241.068056] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 241.068061] kworker/3:1 D 0000000000000003 0 34 2 0x00000000
[ 241.068071] ffff8801395f9b00 0000000000000046 ffff880134c69cd8 0000000000000000
[ 241.068080] ffff880139b944d0 ffff8801395f9fd8 ffff8801395f9fd8 ffff8801395f9fd8
[ 241.068088] ffff880139b90000 ffff880139b944d0 0000000000000002 7fffffffffffffff
[ 241.068095] Call Trace:
[ 241.068110] [<ffffffff8164ea6f>] schedule+0x3f/0x60
[ 241.068118] [<ffffffff8164d0a5>] schedule_timeout+0x2a5/0x320
[ 241.068128] [<ffffffff81088b93>] ? dequeue_entity+0x123/0x300
[ 241.068136] [<ffffffff8164e8af>] wait_for_common+0xdf/0x180
[ 241.068143] [<ffffffff81081340>] ? try_to_wake_up+0x2c0/0x2c0
[ 241.068150] [<ffffffff8164ea2d>] wait_for_completion+0x1d/0x20
[ 241.068158] [<ffffffff8106c0d1>] wait_on_work+0x191/0x1a0
[ 241.068164] [<ffffffff8106a280>] ? do_work_for_cpu+0x30/0x30
[ 241.068171] [<ffffffff8106d63e>] __cancel_work_timer+0x8e/0x150
[ 241.068178] [<ffffffff8106d730>] cancel_work_sync+0x10/0x20
[ 241.068215] [<ffffffffa0111645>] e1000_down_and_stop+0x25/0x50 [e1000]
[ 241.068230] [<ffffffffa011531f>] e1000_down+0x14f/0x200 [e1000]
[ 241.068244] [<ffffffffa01181c0>] ? e1000_change_mtu+0x1c0/0x1c0 [e1000]
[ 241.068258] [<ffffffffa011822e>] e1000_reset_task+0x6e/0x90 [e1000]
[ 241.068266] [<ffffffff8106ceea>] process_one_work+0x11a/0x480
[ 241.068273] [<ffffffff8106dc84>] worker_thread+0x164/0x370
[ 241.068280] [<ffffffff8106db20>] ? manage_workers.isra.28+0x230/0x230
[ 241.068286] [<ffffffff81072463>] kthread+0x93/0xa0
[ 241.068293] [<ffffffff81659024>] kernel_thread_helper+0x4/0x10
[ 241.068300] [<ffffffff810723d0>] ? kthread_freezable_should_stop+0x70/0x70
[ 241.068307] [<ffffffff81659020>] ? gs_change+0x13/0x13
[ 264.804132] device eth1 entered promiscuous mode
[ 302.415376] device eth1 left promiscuous mode
[ 361.068045] INFO: task kworker/3:1:34 blocked for more than 120 seconds.
[ 361.070780] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 361.073628] kworker/3:1 D 0000000000000003 0 34 2 0x00000000
[ 361.073646] ffff8801395f9b00 0000000000000046 ffff880134c69cd8 0000000000000000
[ 361.073676] ffff880139b944d0 ffff8801395f9fd8 ffff8801395f9fd8 ffff8801395f9fd8
[ 361.073705] ffff880139b90000 ffff880139b944d0 0000000000000002 7fffffffffffffff
[ 361.073735] Call Trace:
[ 361.073756] [<ffffffff8164ea6f>] schedule+0x3f/0x60
[ 361.073775] [<ffffffff8164d0a5>] schedule_timeout+0x2a5/0x320
[ 361.073796] [<ffffffff81088b93>] ? dequeue_entity+0x123/0x300
[ 361.073816] [<ffffffff8164e8af>] wait_for_common+0xdf/0x180
[ 361.073837] [<ffffffff81081340>] ? try_to_wake_up+0x2c0/0x2c0
[ 361.073856] [<ffffffff8164ea2d>] wait_for_completion+0x1d/0x20
[ 361.073876] [<ffffffff8106c0d1>] wait_on_work+0x191/0x1a0
[ 361.073894] [<ffffffff8106a280>] ? do_work_for_cpu+0x30/0x30
[ 361.073913] [<ffffffff8106d63e>] __cancel_work_timer+0x8e/0x150
[ 361.073933] [<ffffffff8106d730>] cancel_work_sync+0x10/0x20
[ 361.073978] [<ffffffffa0111645>] e1000_down_and_stop+0x25/0x50 [e1000]
[ 361.074006] [<ffffffffa011531f>] e1000_down+0x14f/0x200 [e1000]
[ 361.074034] [<ffffffffa01181c0>] ? e1000_change_mtu+0x1c0/0x1c0 [e1000]
[ 361.074062] [<ffffffffa011822e>] e1000_reset_task+0x6e/0x90 [e1000]
[ 361.074083] [<ffffffff8106ceea>] process_one_work+0x11a/0x480
[ 361.074103] [<ffffffff8106dc84>] worker_thread+0x164/0x370
[ 361.074122] [<ffffffff8106db20>] ? manage_workers.isra.28+0x230/0x230
[ 361.074142] [<ffffffff81072463>] kthread+0x93/0xa0
[ 361.074160] [<ffffffff81659024>] kernel_thread_helper+0x4/0x10
[ 361.074180] [<ffffffff810723d0>] ? kthread_freezable_should_stop+0x70/0x70
[ 361.074200] [<ffffffff81659020>] ? gs_change+0x13/0x13
(In reply to comment #6)
> Today i compiled the 3.3.4 kernel without ubuntu patches and the e1000 driver
> is still not working. Here is the dmesg:
> [...]

I have 3.3.4 installed, but I don't see this issue on my system. Are there any driver error messages in the dmesg log right before the call trace with kworker is printed?
I don't see much in the dmesg log (attached), but I do see the MTU being changed: "eth1 changing MTU from 1500 to 9014". Is there anything in your system's network config that might help trigger this issue and that you could share?
This Linux box is used for iSCSI, smbd, IPv4 NAT and as an IPv6 router (with a tunnel to an IPv6 broker). eth0 uses dhclient to get dynamic IP addresses from my internet cable modem (public address). eth1 is my internal network. For IPv4 clients on eth1 I use dhcpd. For IPv6, radvd and dhcpd are running on eth1. For the IPv6 tunnel I use aiccu to a Dutch tunnel broker.

Network interfaces:
eth0: RTL8168e/8111e at 0xffffc90000650000, 1c:6f:65:5d:06:82, XID 0c200000 IRQ 42
eth1: (PCI:33MHz:32-bit) 00:1b:21:8d:b9:b1
eth1: Intel(R) PRO/1000 Network Connection

ifconfig:
eth0      Link encap:Ethernet  HWaddr 1c:6f:65:5d:06:82
          inet addr:94.209.xxx.xxx  Bcast:255.255.255.255  Mask:255.255.248.0
          UP BROADCAST RUNNING MULTICAST  MTU:576  Metric:1
          RX packets:167095 errors:0 dropped:0 overruns:0 frame:0
          TX packets:96044 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:150744445 (150.7 MB)  TX bytes:8718928 (8.7 MB)
          Interrupt:42

eth2      Link encap:Ethernet  HWaddr 9c:eb:e8:04:7c:31
          inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::9eeb:e8ff:fe04:7c31/64 Scope:Link
          inet6 addr: 2001:xxxx:xxxx::xxxx/64 Scope:Global
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:162151 errors:0 dropped:0 overruns:0 frame:0
          TX packets:314136 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:14495563 (14.4 MB)  TX bytes:384336984 (384.3 MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1207 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1207 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:101145 (101.1 KB)  TX bytes:101145 (101.1 KB)

sixxs     Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          inet6 addr: fe80::4b8:2ff:2fa:2/64 Scope:Link
          inet6 addr: 2001:7b8:xxxx:xxxx::xxxx/64 Scope:Global
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1280  Metric:1
          RX packets:4000 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2649 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:3393260 (3.3 MB)  TX bytes:791482 (791.4 KB)

route -n:
0.0.0.0         94.209.xxxx.1   0.0.0.0         UG    100    0        0 eth0
94.209.xxxx.0   0.0.0.0         255.255.248.0   U     0      0        0 eth0
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1

route -6 -n:
2001:7b8:xxxx:xxxx::/64         ::                    U    256  0     1 sixxs
2001:7b8:xxxx::/64              ::                    U    256  0     0 eth1
fe80::/64                       ::                    U    256  0     0 eth1
fe80::/64                       ::                    U    256  0     0 sixxs
::/0                            2001:7b8:2ff:2fa::1   UG   1024 0     0 sixxs
::/0                            ::                    !n   -1   1  4476 lo
::1/128                         ::                    Un   0    1    43 lo
2001:7b8:xxxx:xxxx::/128        ::                    Un   0    1     0 lo
2001:7b8:xxxx:xxxx::2/128       ::                    Un   0    1  2042 lo
2001:7b8:xxxx::/128             ::                    Un   0    1     0 lo
2001:7b8:xxxx::xxxx/128         ::                    Un   0    1   432 lo
fe80::/128                      ::                    Un   0    1     0 lo
fe80::/128                      ::                    Un   0    1     0 lo
fe80::4b8:2ff:2fa:2/128         ::                    Un   0    1     0 lo
fe80::9eeb:e8ff:fe04:7c31/128   ::                    Un   0    1    61 lo
ff00::/8                        ::                    U    256  0     0 eth1
ff00::/8                        ::                    U    256  0     0 sixxs
::/0                            ::                    !n   -1   1  4476 lo
The eth2 in my ifconfig output is a temporary network card right now. Normally this configuration is on the Intel e1000 network card, eth1.
The deadlock happened to me only when I used a DHCP client to get an IPv4 address from a DHCP server (the same is reported in Debian Bug#665693). Ubuntu (and Debian, I think) uses dhclient3 for dynamic IP configuration, which is part of the isc-dhcp-client package.

dpkg -s isc-dhcp-client
Package: isc-dhcp-client
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Source: isc-dhcp
Version: 4.1.ESV-R4-0ubuntu5
Provides: dhcp3-client

Command line that is running:
dhclient3 -e IF_METRIC=100 -pf /var/run/dhclient.eth1.pid -lf /var/lib/dhcp/dhclient.eth1.leases -1 eth1

Note that the same user-space utilities and configuration work properly with the 3.0.0-17 kernel.
Can somebody upload the lspci -vvv output *after* the issue occurs?
I noticed (and it was not in my ifconfig output) that with the default MTU of 1500 there is no problem. With MTU 9014 I get these problems. MTU 9014 always worked with kernel 3.0 and earlier. The lspci -vvv output after the issue (with MTU 9014) will be here in a moment.
Created attachment 73174 [details] lspci -vvv after the problem with mtu 9014 and kernel 3.4.0
(In reply to comment #13)
> I noticed (and it was not in my ifconfig output) that with the default 1500 mtu
> there is no problem. With mtu 9014 i get these problems. mtu 9014 always worked
> with kernel 3.0 and before. lspci -vvv output after the issue (with 9014 mtu)
> will be here in a moment.

This is good to know, and it is what I suspected. I am going to try to reproduce again with MTU 9014.
No repro on 3.3.4 with MTU=9014. I will burn Ubuntu 12.04 Desktop 64-bit tomorrow and try to reproduce again.
Created attachment 73178 [details] lspci -vvv from working system, kernel 3.0.0-17
Created attachment 73179 [details] the lspci -vvv content *after* issue occur, kernel 3.4.0-030400rc3
Pay attention to the message about the adapter reset in dmesg:

e1000 0000:04:02.1: eth1: Reset adapter

It may be related to the problem.
(In reply to comment #19)
> Pay attention to the message about adapter reset in dmesg:
>
> e1000 0000:04:02.1: eth1: Reset adapter
>
> Maybe related to the problem.

Alright, this makes more sense now. It looks like the reset does not complete successfully. I think it would make more sense to find out why the reset occurs. Do you have more info in the dmesg log about the reset? Are there any Tx hang messages in the dmesg log?
(In reply to comment #20)
> (In reply to comment #19)
> > Pay attention to the message about adapter reset in dmesg:
> >
> > e1000 0000:04:02.1: eth1: Reset adapter
> >
> > Maybe related to the problem.
>
> Alright this makes more sense now. Looks like reset does not complete
> successfully. I think it would make more sense to find out why reset occurs.
> DO you have more info in dmesg log about reset? Is there Tx hang messages in
> dmesg log?

Igor, I think I see what's going on. I am not in front of my Linux box right now. I will send you a patch to test tomorrow morning.
No, there is no "TX hang" message in dmesg. BTW, all messages are attached.
Created attachment 73189 [details] e1000_main.c.patch test patch for the issue.
I have attached a patch - e1000_main.c.patch - for testing. Please try this patch.
Created attachment 73190 [details]
dmesg

I have compiled kernel 3.4.0-rc5 with e1000_main.c.patch and it looks like jumbo frames are working again. When I enable jumbo frames the network device keeps working and I don't get any "task kworker" messages. In my dmesg you will still see a

[ 9.348211] e1000 0000:03:00.0: eth1: Reset adapter

message, but the patch prevents the deadlock. I will also try to compile this on the kernel 3.2 source with Ubuntu patches. My iscsitarget does not compile against the 3.4.0-rc5 headers. Within a couple of hours I can give you that result as well.
I can confirm that the patch is working. No more blocked kworker task. The interface receives its IP from DHCP and is working as expected and stable. I have tested the 3.4.0-rc3 kernel. Tushar, thanks a lot!
This patch is also working on the 3.2.14 kernel. And a big thanks from me as well, Tushar!
A patch referencing this bug report has been merged in Linux v3.4:

commit 8ce6909f77ba1b7bcdea65cc2388fd1742b6d669
Author: Tushar Dave <tushar.n.dave@intel.com>
Date:   Thu May 17 01:04:50 2012 +0000

    e1000: Prevent reset task killing itself.
A patch referencing this bug report has been merged in Linux v3.4:

commit 39c2028531332cab1325637c2100f3189fa1be72
Merge: 5c7dd71 8ce6909
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Thu May 17 16:30:26 2012 -0700