Bug 9124 - Netconsole race crashed the system
Summary: Netconsole race crashed the system
Status: REJECTED INVALID
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 high
Assignee: Arnaldo Carvalho de Melo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-10-04 16:24 UTC by Tina Yang
Modified: 2008-09-26 05:27 UTC

See Also:
Kernel Version: 2.6.23rc8
Tree: Mainline
Regression: ---


Attachments

Description Tina Yang 2007-10-04 16:24:15 UTC
Most recent kernel where this bug did not occur:
Think the problem has always been there.
Distribution:
Hardware Environment:
DELL PowerEdge 2650 (x86)
DELL PowerEdge 2850(x86_64)
HP ProLiant DL380 G5 (x86_64) 
with various NICs - e1000, tg3, bnx2
Software Environment:
2.6.9, 2.6.18, 2.6.23
Problem Description:
On 2.6.18 this issue was found on e1000 and tg3. On mainline 2.6.23-rc* it was
found on e100, tg3 and bnx2. The system either panicked at netdevice.h:890 or
hung, and sometimes, depending on which NIC was in use, printed the following
console messages:
 e1000:
      "e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang"
 tg3:
      "NETDEV WATCHDOG: eth4: transmit timed out"
      "tg3: eth4: transmit timed out, resetting"

Steps to reproduce:
1. On 2.6.18 (both x86 and x86_64), insert the netconsole module (NIC: e1000 and tg3).
2. Run a moderate I/O load, preferably fio: one process doing async + direct I/O using libaio.

fio jobfile:
[global]
iodepth=1024
iodepth_batch=60
randrepeat=1
size=1024m
directory=/home/oracle
numjobs=2
[job1]
bs=8k
direct=1
ioengine=libaio
rw=randrw
filename=file1:file2

3. From second console as root do " echo t > /proc/sysrq-trigger"

Machine will instantly hang.


Crash stack captured on 2.6.9
       PANIC: "kernel BUG at include/linux/netdevice.h:888!"
#0 [ 23c5e60] disk_dump at f9ca71a2
#1 [ 23c5e64] printk at 21228d6
#2 [ 23c5e70] freeze_other_cpus at f9ca6ef5
#3 [ 23c5e80] start_disk_dump at f9ca6fa0
#4 [ 23c5e90] try_crashdump at 2133766
#5 [ 23c5e98] die at 2106354
#6 [ 23c5ecc] do_invalid_op at 210672f
#7 [ 23c5f7c] error_code (via invalid_op) at fffecede
   EAX: 00000006  EBX: 00200202  ECX: 00000000  EDX: df287000  EBP: e05ca000
   DS:  007b      ESI: 00000001  ES:  007b      EDI: e05ca240 
   CS:  0060      EIP: f8c82a08  ERR: ffffffff  EFLAGS: 00210046 
#8 [ 23c5fb8] tg3_poll at f8c82a08
#9 [ 23c5fd0] net_rx_action at 227a8da
#10 [ 23c5fe8] __do_softirq at 2126422
--- <soft IRQ> ---
#0 [25c71cac] do_softirq at 2108460
#1 [25c71cb4] dev_queue_xmit at 227a0d2
#2 [25c71ccc] ip_finish_output at 229288d
#3 [25c71ce4] ip_queue_xmit at 2292fa9
#4 [25c71dac] tcp_transmit_skb at 22a0ff7
#5 [25c71dec] tcp_write_xmit at 22a1901
#6 [25c71e10] tcp_sendmsg at 2297d6d
#7 [25c71e80] sock_aio_write at 2272512
#8 [25c71eec] do_sync_write at 215a444
#9 [25c71f88] vfs_write at 215a53a
#10 [25c71fa4] sys_write at 215a5f4
#11 [25c71fc0] system_call at fffec219 

net_device in memory,
  name = "eth0\000\000\000\000\000\000\000\000\000\000\000", 
  mem_end = 0, 
  mem_start = 0, 
  base_addr = 0, 
  irq = 209, 
  if_port = 0 '\0', 
  dma = 0 '\0', 
  state = 6, 
  next = 0xbf41b000, 
  init = 0, 
  next_sched = 0x0, 
  ifindex = 2, 
  iflink = 2, 
  get_stats = 0xf8c87737, 
  get_wireless_stats = 0, 
  wireless_handlers = 0x0, 
  ethtool_ops = 0xf8c964e0, 
  trans_start = 128269465, 
  last_rx = 128269464, 
  flags = 4099, 
  gflags = 0, 
  priv_flags = 32, 
  unused_alignment_fixer = 0, 
  mtu = 1500, 
  type = 1, 
  hard_header_len = 14, 
  priv = 0xbf430240, 
  master = 0x0, 
  broadcast = "<FF><FF><FF><FF><FF><FF>\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000", 
  dev_addr = "\000\tk<E6>g<EB>\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000", 
  addr_len = 6 '\006', 
  reserved = 0 '\0', 
  priv_len = 1980, 
  mc_list = 0x15f48440, 
  mc_count = 1, 
  promiscuity = 0, 
  allmulti = 0, 
  watchdog_timeo = 5000, 
  watchdog_timer = {
    entry = {
      next = 0x1594af48, 
      prev = 0x1594af48
    }, 
    expires = 128269531, 
    lock = {
      lock = 1, 
      magic = 3735899821
    }, 
    magic = 1267182958, 
    function = 0x2286c74 <dev_watchdog>, 
    data = 3208839168, 
    base = 0x1594a860
  }, 
  atalk_ptr = 0x0, 
  ip_ptr = 0xc1e7de80, 
  dn_ptr = 0x0, 
  ip6_ptr = 0x0, 
  ec_ptr = 0x0, 
  ax25_ptr = 0x0, 
  poll_list = {
    next = 0x100100, 
    prev = 0x200200
  }, 
 ...
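
A note on the poll_list values in the 2.6.9 dump above: next = 0x100100 and prev = 0x200200 match LIST_POISON1 and LIST_POISON2, the markers that list_del() writes into an entry after unlinking it. That would suggest the device had already been taken off the poll list by the time the dump was written. The following is only a user-space sketch of that poisoning behaviour, not kernel code; list_entry_is_poisoned() is a hypothetical helper added for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Poison markers as used by the kernel's list_del() on 32-bit builds. */
#define LIST_POISON1 ((void *)0x00100100)
#define LIST_POISON2 ((void *)0x00200200)

struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert `entry` right after `head`. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink `entry` and poison its pointers, as the kernel version does. */
static void list_del(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

/* Hypothetical helper: does this entry look like it was list_del()'d? */
static int list_entry_is_poisoned(const struct list_head *entry)
{
	return entry->next == LIST_POISON1 && entry->prev == LIST_POISON2;
}
```

Calling list_del() a second time on such an entry would dereference the poison addresses, which is one reason netif_rx_complete() checks the RX_SCHED bit before touching the list.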


Crash stack captured on 2.6.18
       PANIC: "kernel BUG at include/linux/netdevice.h:890!"
 #0 [c072ce30] crash_kexec at c044418a
 #1 [c072ce74] die at c04054d0
 #2 [c072cea4] do_invalid_op at c0405c20
 #3 [c072cf54] error_code (via invalid_op) at c0404ab3
    EAX: 00000007  EBX: 00000202  ECX: 00000000  EDX: f6d9c000  EBP: f6d9c400 
    DS:  007b      ESI: 00000001  ES:  007b      EDI: cb02b280 
    CS:  0060      EIP: f8927791  ERR: ffffffff  EFLAGS: 00010046 
 #4 [c072cf88] tg3_poll at f8927791
--- <soft IRQ> ---
 #0 [f7e54f60] do_softirq at c0406433
 #1 [f7e54f6c] do_IRQ at c0406425
 #2 [f7e54fb4] cpu_idle at c0402c8e

net_device in memory,
  name = "eth4\000\000\000\000\000\000\000\000\000\000\000", 
  name_hlist = {
    next = 0x0, 
    pprev = 0xc07d0148
  }, 
  mem_end = 0, 
  mem_start = 0, 
  base_addr = 0, 
  irq = 201, 
  if_port = 0 '\0', 
  dma = 0 '\0', 
  state = 39, 
  next = 0xf7387000, 
  init = 0, 
  features = 419, 
  next_sched = 0x0, 
  ifindex = 2, 
  iflink = 2, 
  get_stats = 0xf892016b <tg3_get_stats>, 
  get_wireless_stats = 0, 
  wireless_handlers = 0x0, 
  wireless_data = 0x0, 
  cfg80211_wext_pending_config = 0x0, 
  ethtool_ops = 0xf89301a0, 
  flags = 4099, 
  priv_flags = 0, 
  padded = 0, 
  operstate = 6 '\006', 
  link_mode = 0 '\0', 
  mtu = 1500, 
  type = 1, 
  hard_header_len = 14, 
  master = 0x0, 
  perm_addr = "\000\021C5\033\004\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000", 
  addr_len = 6 '\006', 
  dev_id = 0, 
  mc_list = 0xf59f0ac0, 
  mc_count = 5, 
  promiscuity = 0, 
  allmulti = 0, 
  atalk_ptr = 0x0, 
  ip_ptr = 0xcb308280, 
  dn_ptr = 0x0, 
  ip6_ptr = 0xf71e5c00, 
  ec_ptr = 0x0, 
  ax25_ptr = 0x0, 
  ieee80211_ptr = 0x0, 
  poll_list = {
    next = 0xcb0232a0, 
    prev = 0xcb0232a0
  },
  ...
Comment 1 Anonymous Emailer 2007-10-04 16:43:45 UTC
Reply-To: akpm@linux-foundation.org


(Please respond by emailed reply-to-all, not via the bugzilla web interface)

OK, but in my 2.6.18, include/linux/netdevice.h:890 is a
local_irq_restore() in netif_rx_complete().  I don't see how that can go
BUG.

Does your 2.6.18 have any patches applied?

Please tell us what is at include/linux/netdevice.h:890 in your 2.6.18
tree.
Comment 2 Tina Yang 2007-10-04 16:48:46 UTC
The postmortem vmcore analysis points to a race between net_rx_action and 
netpoll, and disabling the following code segment cures all problems.

netpoll.c
        /* Process pending work on NIC */
        np->dev->poll_controller(np->dev);
        if (np->dev->poll)
                poll_napi(np);

There seem to be several race windows in poll_controller() and poll_napi(),
and fixing them would probably have consequences for overall system performance.
Maybe this code should only run when the machine is single-threaded?
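
To make the suspected race concrete, here is a deterministic user-space model of one possible interleaving: the device is scheduled for polling once, but the driver's poll routine runs to completion from two contexts (for example net_rx_action on one CPU and netpoll on another). This is an illustration only, not a reconstruction from the vmcore. All model_* names are hypothetical, and the bit position of __LINK_STATE_RX_SCHED is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* ASSUMPTION: bit position of __LINK_STATE_RX_SCHED; check your tree. */
#define MODEL_RX_SCHED (1UL << 5)

/* Hypothetical stand-in for the net_device fields that matter here. */
struct model_dev {
	unsigned long state;
	int bug_hits;	/* times the BUG_ON() condition would have been true */
};

/* Models the bit handling of netif_rx_schedule_prep(): first caller wins. */
static bool model_rx_schedule(struct model_dev *dev)
{
	if (dev->state & MODEL_RX_SCHED)
		return false;
	dev->state |= MODEL_RX_SCHED;
	return true;
}

/* Models netif_rx_complete(), counting instead of crashing. */
static void model_rx_complete(struct model_dev *dev)
{
	if (!(dev->state & MODEL_RX_SCHED)) {
		dev->bug_hits++;
		return;
	}
	dev->state &= ~MODEL_RX_SCHED;
}

/* Models a driver poll routine that drains its ring, then completes. */
static void model_poll(struct model_dev *dev)
{
	model_rx_complete(dev);
}

/* Schedule once, then let `pollers` contexts each finish a poll. */
static int model_replay(int pollers)
{
	struct model_dev dev = { 0, 0 };
	int i;

	assert(pollers >= 0);
	model_rx_schedule(&dev);
	for (i = 0; i < pollers; i++)
		model_poll(&dev);
	return dev.bug_hits;
}
```

With a single poller the check never fires; with two pollers the second completion finds the bit already cleared, which is the condition the BUG_ON() in netif_rx_complete() tests for.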
Comment 3 Tina Yang 2007-10-04 16:59:14 UTC
Andrew Morton wrote:
> OK, but in my 2.6.18, include/linux/netdevice.h:890 is a
> local_irq_restore() in netif_rx_complete().  I don't see how that can go
> BUG.
>
> Does your 2.6.18 have any patches applied?
>
> Please tell us what is at include/linux/netdevice.h:890 in your 2.6.18
> tree.
>
>   
Hi, here is the segment of the file,  thanks.
    880 /* Remove interface from poll list: it must be in the poll list
    881  * on current cpu. This primitive is called by dev->poll(), when
    882  * it completes the work. The device cannot be out of poll list at this
    883  * moment, it is BUG().
    884  */
    885 static inline void netif_rx_complete(struct net_device *dev)
    886 {
    887         unsigned long flags;
    888
    889         local_irq_save(flags);
    890         BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
    891         list_del(&dev->poll_list);
    892         smp_mb__before_clear_bit();
    893         clear_bit(__LINK_STATE_RX_SCHED, &dev->state);
    894         local_irq_restore(flags);
    895 }
    896
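
For reference, the `state` values in the two net_device dumps (6 on 2.6.9, 39 on 2.6.18) can be decoded against this check. The sketch below assumes the bit ordering XOFF, START, PRESENT, SCHED, NOCARRIER, RX_SCHED, which puts __LINK_STATE_RX_SCHED at bit 5; that ordering is an assumption and should be verified against the enum in the attached header. The model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* ASSUMPTION: ordering of the link-state bits in this kernel era. */
enum model_link_state {
	MODEL_LINK_STATE_XOFF = 0,
	MODEL_LINK_STATE_START,
	MODEL_LINK_STATE_PRESENT,
	MODEL_LINK_STATE_SCHED,
	MODEL_LINK_STATE_NOCARRIER,
	MODEL_LINK_STATE_RX_SCHED,
};

/* User-space equivalent of test_bit() on a single word. */
static bool model_test_bit(int nr, unsigned long word)
{
	assert(nr >= 0 && nr < (int)(8 * sizeof(word)));
	return (word >> nr) & 1UL;
}

/* Would the BUG_ON() in netif_rx_complete() pass for this state word? */
static bool model_rx_sched_set(unsigned long state)
{
	return model_test_bit(MODEL_LINK_STATE_RX_SCHED, state);
}
```

Under that assumption, state = 6 has only START and PRESENT set, so RX_SCHED is clear, which fits the BUG_ON() firing. state = 39 has RX_SCHED set; since the dump is a snapshot taken after the BUG, it need not match the value test_bit() saw at the time.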

Comment 4 Tina Yang 2007-10-04 18:31:06 UTC
Andrew Morton wrote:
> OK, but in my 2.6.18, include/linux/netdevice.h:890 is a
> local_irq_restore() in netif_rx_complete().  I don't see how that can go
> BUG.
>
> Does your 2.6.18 have any patches applied?
>
> Please tell us what is at include/linux/netdevice.h:890 in your 2.6.18
> tree.

    netdevice.h attached.
    890         BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
   
/*
 * INET		An implementation of the TCP/IP protocol suite for the LINUX
 *		operating system.  INET is implemented using the  BSD Socket
 *		interface as the means of communication with the user level.
 *
 *		Definitions for the Interfaces handler.
 *
 * Version:	@(#)dev.h	1.0.10	08/12/93
 *
 * Authors:	Ross Biro
 *		Fred N. van Kempen, <waltje@uWalt.NL.Mugnet.ORG>
 *		Corey Minyard <wf-rch!minyard@relay.EU.net>
 *		Donald J. Becker, <becker@cesdis.gsfc.nasa.gov>
 *		Alan Cox, <Alan.Cox@linux.org>
 *		Bjorn Ekwall. <bj0rn@blox.se>
 *              Pekka Riikonen <priikone@poseidon.pspt.fi>
 *
 *		This program is free software; you can redistribute it and/or
 *		modify it under the terms of the GNU General Public License
 *		as published by the Free Software Foundation; either version
 *		2 of the License, or (at your option) any later version.
 *
 *		Moved to /usr/include/linux for NET3
 */
#ifndef _LINUX_NETDEVICE_H
#define _LINUX_NETDEVICE_H

#include <linux/if.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>

#ifdef __KERNEL__
#include <asm/atomic.h>
#include <asm/cache.h>
#include <asm/byteorder.h>

#include <linux/device.h>
#include <linux/percpu.h>
#include <linux/dmaengine.h>

struct divert_blk;
struct vlan_group;
struct ethtool_ops;
struct netpoll_info;
					/* source back-compat hooks */
#define SET_ETHTOOL_OPS(netdev,ops) \
	( (netdev)->ethtool_ops = (ops) )

#define HAVE_ALLOC_NETDEV		/* feature macro: alloc_xxxdev
					   functions are available. */
#define HAVE_FREE_NETDEV		/* free_netdev() */
#define HAVE_NETDEV_PRIV		/* netdev_priv() */

#define NET_XMIT_SUCCESS	0
#define NET_XMIT_DROP		1	/* skb dropped			*/
#define NET_XMIT_CN		2	/* congestion notification	*/
#define NET_XMIT_POLICED	3	/* skb is shot by police	*/
#define NET_XMIT_BYPASS		4	/* packet does not leave via dequeue;
					   (TC use only - dev_queue_xmit
					   returns this as NET_XMIT_SUCCESS) */

/* Backlog congestion levels */
#define NET_RX_SUCCESS		0   /* keep 'em coming, baby */
#define NET_RX_DROP		1  /* packet dropped */
#define NET_RX_CN_LOW		2   /* storm alert, just in case */
#define NET_RX_CN_MOD		3   /* Storm on its way! */
#define NET_RX_CN_HIGH		4   /* The storm is here */
#define NET_RX_BAD		5  /* packet dropped due to kernel error */

#define net_xmit_errno(e)	((e) != NET_XMIT_CN ? -ENOBUFS : 0)

#endif

#define MAX_ADDR_LEN	32		/* Largest hardware address length */

/* Driver transmit return codes */
#define NETDEV_TX_OK 0		/* driver took care of packet */
#define NETDEV_TX_BUSY 1	/* driver tx path was busy*/
#define NETDEV_TX_LOCKED -1	/* driver tx lock was already taken */

/*
 *	Compute the worst case header length according to the protocols
 *	used.
 */
 
#if !defined(CONFIG_AX25) && !defined(CONFIG_AX25_MODULE) && !defined(CONFIG_TR)
#define LL_MAX_HEADER	32
#else
#if defined(CONFIG_AX25) || defined(CONFIG_AX25_MODULE)
#define LL_MAX_HEADER	96
#else
#define LL_MAX_HEADER	48
#endif
#endif

#if !defined(CONFIG_NET_IPIP) && \
    !defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE)
#define MAX_HEADER LL_MAX_HEADER
#else
#define MAX_HEADER (LL_MAX_HEADER + 48)
#endif

/*
 *	Network device statistics. Akin to the 2.0 ether stats but
 *	with byte counters.
 */
 
struct net_device_stats
{
	unsigned long	rx_packets;		/* total packets received	*/
	unsigned long	tx_packets;		/* total packets transmitted	*/
	unsigned long	rx_bytes;		/* total bytes received 	*/
	unsigned long	tx_bytes;		/* total bytes transmitted	*/
	unsigned long	rx_errors;		/* bad packets received		*/
	unsigned long	tx_errors;		/* packet transmit problems	*/
	unsigned long	rx_dropped;		/* no space in linux buffers	*/
	unsigned long	tx_dropped;		/* no space available in linux	*/
	unsigned long	multicast;		/* multicast packets received	*/
	unsigned long	collisions;

	/* detailed rx_errors: */
	unsigned long	rx_length_errors;
	unsigned long	rx_over_errors;		/* receiver ring buff overflow	*/
	unsigned long	rx_crc_errors;		/* recved pkt with crc error	*/
	unsigned long	rx_frame_errors;	/* recv'd frame alignment error */
	unsigned long	rx_fifo_errors;		/* recv'r fifo overrun		*/
	unsigned long	rx_missed_errors;	/* receiver missed packet	*/

	/* detailed tx_errors */
	unsigned long	tx_aborted_errors;
	unsigned long	tx_carrier_errors;
	unsigned long	tx_fifo_errors;
	unsigned long	tx_heartbeat_errors;
	unsigned long	tx_window_errors;
	
	/* for cslip etc */
	unsigned long	rx_compressed;
	unsigned long	tx_compressed;
};


/* Media selection options. */
enum {
        IF_PORT_UNKNOWN = 0,
        IF_PORT_10BASE2,
        IF_PORT_10BASET,
        IF_PORT_AUI,
        IF_PORT_100BASET,
        IF_PORT_100BASETX,
        IF_PORT_100BASEFX
};

#ifdef __KERNEL__

#include <linux/cache.h>
#include <linux/skbuff.h>

struct neighbour;
struct neigh_parms;
struct sk_buff;

struct netif_rx_stats
{
	unsigned total;
	unsigned dropped;
	unsigned time_squeeze;
	unsigned cpu_collision;
};

DECLARE_PER_CPU(struct netif_rx_stats, netdev_rx_stat);


/*
 *	We tag multicasts with these structures.
 */
 
struct dev_mc_list
{	
	struct dev_mc_list	*next;
	__u8			dmi_addr[MAX_ADDR_LEN];
	unsigned char		dmi_addrlen;
	int			dmi_users;
	int			dmi_gusers;
};

struct hh_cache
{
	struct hh_cache *hh_next;	/* Next entry			     */
	atomic_t	hh_refcnt;	/* number of users                   */
	unsigned short  hh_type;	/* protocol identifier, f.e ETH_P_IP
                                         *  NOTE:  For VLANs, this will be the
                                         *  encapsulated type. --BLG
                                         */
	int		hh_len;		/* length of header */
	int		(*hh_output)(struct sk_buff *skb);
	rwlock_t	hh_lock;

	/* cached hardware header; allow for machine alignment needs.        */
#define HH_DATA_MOD	16
#define HH_DATA_OFF(__len) \
	(HH_DATA_MOD - (((__len - 1) & (HH_DATA_MOD - 1)) + 1))
#define HH_DATA_ALIGN(__len) \
	(((__len)+(HH_DATA_MOD-1))&~(HH_DATA_MOD - 1))
	unsigned long	hh_data[HH_DATA_ALIGN(LL_MAX_HEADER) / sizeof(long)];
};

/* Reserve HH_DATA_MOD byte aligned hard_header_len, but at least that much.
 * Alternative is:
 *   dev->hard_header_len ? (dev->hard_header_len +
 *                           (HH_DATA_MOD - 1)) & ~(HH_DATA_MOD - 1) : 0
 *
 * We could use other alignment values, but we must maintain the
 * relationship HH alignment <= LL alignment.
 */
#define LL_RESERVED_SPACE(dev) \
	(((dev)->hard_header_len&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
#define LL_RESERVED_SPACE_EXTRA(dev,extra) \
	((((dev)->hard_header_len+extra)&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)

/* These flag bits are private to the generic network queueing
 * layer, they may not be explicitly referenced by any other
 * code.
 */

enum netdev_state_t
{
	__LINK_STATE_XOFF=0,
	__LINK_STATE_START,
	__LINK_STATE_PRESENT,
	__LINK_STATE_SCHED,
	__LINK_STATE_NOCARRIER,
	__LINK_STATE_RX_SCHED,
	__LINK_STATE_LINKWATCH_PENDING,
	__LINK_STATE_DORMANT,
	__LINK_STATE_QDISC_RUNNING,
};


/*
 * This structure holds at boot time configured netdevice settings. They
 * are then used in the device probing. 
 */
struct netdev_boot_setup {
	char name[IFNAMSIZ];
	struct ifmap map;
};
#define NETDEV_BOOT_SETUP_MAX 8

extern int __init netdev_boot_setup(char *str);

/*
 *	The DEVICE structure.
 *	Actually, this whole structure is a big mistake.  It mixes I/O
 *	data with strictly "high-level" data, and it has to know about
 *	almost every data structure used in the INET module.
 *
 *	FIXME: cleanup struct net_device such that network protocol info
 *	moves out.
 */

struct net_device
{

	/*
	 * This is the first field of the "visible" part of this structure
	 * (i.e. as seen by users in the "Space.c" file).  It is the name
	 * of the interface.
	 */
	char			name[IFNAMSIZ];
	/* device name hash chain */
	struct hlist_node	name_hlist;

	/*
	 *	I/O specific fields
	 *	FIXME: Merge these and struct ifmap into one
	 */
	unsigned long		mem_end;	/* shared mem end	*/
	unsigned long		mem_start;	/* shared mem start	*/
	unsigned long		base_addr;	/* device I/O address	*/
	unsigned int		irq;		/* device IRQ number	*/

	/*
	 *	Some hardware also needs these fields, but they are not
	 *	part of the usual set specified in Space.c.
	 */

	unsigned char		if_port;	/* Selectable AUI, TP,..*/
	unsigned char		dma;		/* DMA channel		*/

	unsigned long		state;

	struct net_device	*next;
	
	/* The device initialization function. Called only once. */
	int			(*init)(struct net_device *dev);

	/* ------- Fields preinitialized in Space.c finish here ------- */

	/* Net device features */
	unsigned long		features;
#define NETIF_F_SG		1	/* Scatter/gather IO. */
#define NETIF_F_IP_CSUM		2	/* Can checksum only TCP/UDP over IPv4. */
#define NETIF_F_NO_CSUM		4	/* Does not require checksum. F.e. loopback. */
#define NETIF_F_HW_CSUM		8	/* Can checksum all the packets. */
#define NETIF_F_HIGHDMA		32	/* Can DMA to high memory. */
#define NETIF_F_FRAGLIST	64	/* Scatter/gather IO. */
#define NETIF_F_HW_VLAN_TX	128	/* Transmit VLAN hw acceleration */
#define NETIF_F_HW_VLAN_RX	256	/* Receive VLAN hw acceleration */
#define NETIF_F_HW_VLAN_FILTER	512	/* Receive filtering on VLAN */
#define NETIF_F_VLAN_CHALLENGED	1024	/* Device cannot handle VLAN packets */
#define NETIF_F_GSO		2048	/* Enable software GSO. */
#define NETIF_F_LLTX		4096	/* LockLess TX */

	/* Segmentation offload features */
#define NETIF_F_GSO_SHIFT	16
#define NETIF_F_GSO_MASK	0xffff0000
#define NETIF_F_TSO		(SKB_GSO_TCPV4 << NETIF_F_GSO_SHIFT)
#define NETIF_F_UFO		(SKB_GSO_UDP << NETIF_F_GSO_SHIFT)
#define NETIF_F_GSO_ROBUST	(SKB_GSO_DODGY << NETIF_F_GSO_SHIFT)
#define NETIF_F_TSO_ECN		(SKB_GSO_TCP_ECN << NETIF_F_GSO_SHIFT)
#define NETIF_F_TSO6		(SKB_GSO_TCPV6 << NETIF_F_GSO_SHIFT)

	/* List of features with software fallbacks. */
#define NETIF_F_GSO_SOFTWARE	(NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6)

#define NETIF_F_GEN_CSUM	(NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
#define NETIF_F_ALL_CSUM	(NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)

	struct net_device	*next_sched;

	/* Interface index. Unique device identifier	*/
	int			ifindex;
	int			iflink;


	struct net_device_stats* (*get_stats)(struct net_device *dev);
	struct iw_statistics*	(*get_wireless_stats)(struct net_device *dev);

	/* List of functions to handle Wireless Extensions (instead of ioctl).
	 * See <net/iw_handler.h> for details. Jean II */
	const struct iw_handler_def *	wireless_handlers;
	/* Instance data managed by the core of Wireless Extensions. */
	struct iw_public_data *	wireless_data;

	/* pending config used by cfg80211/wext compat code only */
	void *cfg80211_wext_pending_config;

	struct ethtool_ops *ethtool_ops;

	/*
	 * This marks the end of the "visible" part of the structure. All
	 * fields hereafter are internal to the system, and may change at
	 * will (read: may be cleaned up at will).
	 */


	unsigned int		flags;	/* interface flags (a la BSD)	*/
	unsigned short		gflags;
        unsigned short          priv_flags; /* Like 'flags' but invisible to userspace. */
	unsigned short		padded;	/* How much padding added by alloc_netdev() */

	unsigned char		operstate; /* RFC2863 operstate */
	unsigned char		link_mode; /* mapping policy to operstate */

	unsigned		mtu;	/* interface MTU value		*/
	unsigned short		type;	/* interface hardware type	*/
	unsigned short		hard_header_len;	/* hardware hdr length	*/

	struct net_device	*master; /* Pointer to master device of a group,
					  * which this device is member of.
					  */

	/* Interface address info. */
	unsigned char		perm_addr[MAX_ADDR_LEN]; /* permanent hw address */
	unsigned char		addr_len;	/* hardware address length	*/
	unsigned short          dev_id;		/* for shared network cards */

	struct dev_mc_list	*mc_list;	/* Multicast mac addresses	*/
	int			mc_count;	/* Number of installed mcasts	*/
	int			promiscuity;
	int			allmulti;


	/* Protocol specific pointers */
	
	void 			*atalk_ptr;	/* AppleTalk link 	*/
	void			*ip_ptr;	/* IPv4 specific data	*/  
	void                    *dn_ptr;        /* DECnet specific data */
	void                    *ip6_ptr;       /* IPv6 specific data */
	void			*ec_ptr;	/* Econet specific data	*/
	void			*ax25_ptr;	/* AX.25 specific data */
	void			*ieee80211_ptr;	/* IEEE 802.11 specific data */

/*
 * Cache line mostly used on receive path (including eth_type_trans())
 */
	struct list_head	poll_list ____cacheline_aligned_in_smp;
					/* Link to poll list	*/

	int			(*poll) (struct net_device *dev, int *quota);
	int			quota;
	int			weight;
	unsigned long		last_rx;	/* Time of last Rx	*/
	/* Interface address info used in eth_type_trans() */
	unsigned char		dev_addr[MAX_ADDR_LEN];	/* hw address, (before bcast 
							because most packets are unicast) */

	unsigned char		broadcast[MAX_ADDR_LEN];	/* hw bcast add	*/

/*
 * Cache line mostly used on queue transmit path (qdisc)
 */
	/* device queue lock */
	spinlock_t		queue_lock ____cacheline_aligned_in_smp;
	struct Qdisc		*qdisc;
	struct Qdisc		*qdisc_sleeping;
	struct list_head	qdisc_list;
	unsigned long		tx_queue_len;	/* Max frames per queue allowed */

	/* Partially transmitted GSO packet. */
	struct sk_buff		*gso_skb;

	/* ingress path synchronizer */
	spinlock_t		ingress_lock;
	struct Qdisc		*qdisc_ingress;

/*
 * One part is mostly used on xmit path (device)
 */
	/* hard_start_xmit synchronizer */
	spinlock_t		_xmit_lock ____cacheline_aligned_in_smp;
	/* cpu id of processor entered to hard_start_xmit or -1,
	   if nobody entered there.
	 */
	int			xmit_lock_owner;
	void			*priv;	/* pointer to private data	*/
	int			(*hard_start_xmit) (struct sk_buff *skb,
						    struct net_device *dev);
	/* These may be needed for future network-power-down code. */
	unsigned long		trans_start;	/* Time (in jiffies) of last Tx	*/

	int			watchdog_timeo; /* used by dev_watchdog() */
	struct timer_list	watchdog_timer;

/*
 * refcnt is a very hot point, so align it on SMP
 */
	/* Number of references to this device */
	atomic_t		refcnt ____cacheline_aligned_in_smp;

	/* delayed register/unregister */
	struct list_head	todo_list;
	/* device index hash chain */
	struct hlist_node	index_hlist;

	/* register/unregister state machine */
	enum { NETREG_UNINITIALIZED=0,
	       NETREG_REGISTERED,	/* completed register_netdevice */
	       NETREG_UNREGISTERING,	/* called unregister_netdevice */
	       NETREG_UNREGISTERED,	/* completed unregister todo */
	       NETREG_RELEASED,		/* called free_netdev */
	} reg_state;

	/* Called after device is detached from network. */
	void			(*uninit)(struct net_device *dev);
	/* Called after last user reference disappears. */
	void			(*destructor)(struct net_device *dev);

	/* Pointers to interface service routines.	*/
	int			(*open)(struct net_device *dev);
	int			(*stop)(struct net_device *dev);
#define HAVE_NETDEV_POLL
	int			(*hard_header) (struct sk_buff *skb,
						struct net_device *dev,
						unsigned short type,
						void *daddr,
						void *saddr,
						unsigned len);
	int			(*rebuild_header)(struct sk_buff *skb);
#define HAVE_MULTICAST			 
	void			(*set_multicast_list)(struct net_device *dev);
#define HAVE_SET_MAC_ADDR  		 
	int			(*set_mac_address)(struct net_device *dev,
						   void *addr);
#define HAVE_PRIVATE_IOCTL
	int			(*do_ioctl)(struct net_device *dev,
					    struct ifreq *ifr, int cmd);
#define HAVE_SET_CONFIG
	int			(*set_config)(struct net_device *dev,
					      struct ifmap *map);
#define HAVE_HEADER_CACHE
	int			(*hard_header_cache)(struct neighbour *neigh,
						     struct hh_cache *hh);
	void			(*header_cache_update)(struct hh_cache *hh,
						       struct net_device *dev,
						       unsigned char *  haddr);
#define HAVE_CHANGE_MTU
	int			(*change_mtu)(struct net_device *dev, int new_mtu);

#define HAVE_TX_TIMEOUT
	void			(*tx_timeout) (struct net_device *dev);

	void			(*vlan_rx_register)(struct net_device *dev,
						    struct vlan_group *grp);
	void			(*vlan_rx_add_vid)(struct net_device *dev,
						   unsigned short vid);
	void			(*vlan_rx_kill_vid)(struct net_device *dev,
						    unsigned short vid);

	int			(*hard_header_parse)(struct sk_buff *skb,
						     unsigned char *haddr);
	int			(*neigh_setup)(struct net_device *dev, struct neigh_parms *);
#ifdef CONFIG_NETPOLL
	struct netpoll_info	*npinfo;
#endif
#ifdef CONFIG_NET_POLL_CONTROLLER
	void                    (*poll_controller)(struct net_device *dev);
#endif

	/* bridge stuff */
	struct net_bridge_port	*br_port;

#ifdef CONFIG_NET_DIVERT
	/* this will get initialized at each interface type init routine */
	struct divert_blk	*divert;
#endif /* CONFIG_NET_DIVERT */

	/* class/net/name entry */
	struct class_device	class_dev;
	/* space for optional statistics and wireless sysfs groups */
	struct attribute_group  *sysfs_groups[3];
};

#define	NETDEV_ALIGN		32
#define	NETDEV_ALIGN_CONST	(NETDEV_ALIGN - 1)

static inline void *netdev_priv(struct net_device *dev)
{
	return (char *)dev + ((sizeof(struct net_device)
					+ NETDEV_ALIGN_CONST)
				& ~NETDEV_ALIGN_CONST);
}

#define SET_MODULE_OWNER(dev) do { } while (0)
/* Set the sysfs physical device reference for the network logical device
 * if set prior to registration will cause a symlink during initialization.
 */
#define SET_NETDEV_DEV(net, pdev)	((net)->class_dev.dev = (pdev))

struct packet_type {
	__be16			type;	/* This is really htons(ether_type). */
	struct net_device	*dev;	/* NULL is wildcarded here	     */
	int			(*func) (struct sk_buff *,
					 struct net_device *,
					 struct packet_type *,
					 struct net_device *);
	struct sk_buff		*(*gso_segment)(struct sk_buff *skb,
						int features);
	int			(*gso_send_check)(struct sk_buff *skb);
	void			*af_packet_priv;
	struct list_head	list;
};

#include <linux/interrupt.h>
#include <linux/notifier.h>

extern struct net_device		loopback_dev;		/* The loopback */
extern struct net_device		*dev_base;		/* All devices */
extern rwlock_t				dev_base_lock;		/* Device list lock */

extern int 			netdev_boot_setup_check(struct net_device *dev);
extern unsigned long		netdev_boot_base(const char *prefix, int unit);
extern struct net_device    *dev_getbyhwaddr(unsigned short type, char *hwaddr);
extern struct net_device *dev_getfirstbyhwtype(unsigned short type);
extern void		dev_add_pack(struct packet_type *pt);
extern void		dev_remove_pack(struct packet_type *pt);
extern void		__dev_remove_pack(struct packet_type *pt);

extern struct net_device	*dev_get_by_flags(unsigned short flags,
						  unsigned short mask);
extern struct net_device	*dev_get_by_name(const char *name);
extern struct net_device	*__dev_get_by_name(const char *name);
extern int		dev_alloc_name(struct net_device *dev, const char *name);
extern int		dev_open(struct net_device *dev);
extern int		dev_close(struct net_device *dev);
extern int		dev_queue_xmit(struct sk_buff *skb);
extern int		register_netdevice(struct net_device *dev);
extern int		unregister_netdevice(struct net_device *dev);
extern void		free_netdev(struct net_device *dev);
extern void		synchronize_net(void);
extern int 		register_netdevice_notifier(struct notifier_block *nb);
extern int		unregister_netdevice_notifier(struct notifier_block *nb);
extern int		call_netdevice_notifiers(unsigned long val, void *v);
extern struct net_device	*dev_get_by_index(int ifindex);
extern struct net_device	*__dev_get_by_index(int ifindex);
extern int		dev_restart(struct net_device *dev);
#ifdef CONFIG_NETPOLL_TRAP
extern int		netpoll_trap(void);
#endif

typedef int gifconf_func_t(struct net_device * dev, char __user * bufptr, int len);
extern int		register_gifconf(unsigned int family, gifconf_func_t * gifconf);
static inline int unregister_gifconf(unsigned int family)
{
	return register_gifconf(family, NULL);
}

/*
 * Incoming packets are placed on per-cpu queues so that
 * no locking is needed.
 */

struct softnet_data
{
	struct net_device	*output_queue;
	struct sk_buff_head	input_pkt_queue;
	struct list_head	poll_list;
	struct sk_buff		*completion_queue;

	struct net_device	backlog_dev;	/* Sorry. 8) */
#ifdef CONFIG_NET_DMA
	struct dma_chan		*net_dma;
#endif
};

DECLARE_PER_CPU(struct softnet_data,softnet_data);

#define HAVE_NETIF_QUEUE

extern void __netif_schedule(struct net_device *dev);

static inline void netif_schedule(struct net_device *dev)
{
	if (!test_bit(__LINK_STATE_XOFF, &dev->state))
		__netif_schedule(dev);
}

static inline void netif_start_queue(struct net_device *dev)
{
	clear_bit(__LINK_STATE_XOFF, &dev->state);
}

static inline void netif_wake_queue(struct net_device *dev)
{
#ifdef CONFIG_NETPOLL_TRAP
	if (netpoll_trap())
		return;
#endif
	if (test_and_clear_bit(__LINK_STATE_XOFF, &dev->state))
		__netif_schedule(dev);
}

static inline void netif_stop_queue(struct net_device *dev)
{
#ifdef CONFIG_NETPOLL_TRAP
	if (netpoll_trap())
		return;
#endif
	set_bit(__LINK_STATE_XOFF, &dev->state);
}

static inline int netif_queue_stopped(const struct net_device *dev)
{
	return test_bit(__LINK_STATE_XOFF, &dev->state);
}

static inline int netif_running(const struct net_device *dev)
{
	return test_bit(__LINK_STATE_START, &dev->state);
}


/* Use this variant when it is known for sure that it
 * is executing from interrupt context.
 */
static inline void dev_kfree_skb_irq(struct sk_buff *skb)
{
	if (atomic_dec_and_test(&skb->users)) {
		struct softnet_data *sd;
		unsigned long flags;

		local_irq_save(flags);
		sd = &__get_cpu_var(softnet_data);
		skb->next = sd->completion_queue;
		sd->completion_queue = skb;
		raise_softirq_irqoff(NET_TX_SOFTIRQ);
		local_irq_restore(flags);
	}
}

/* Use this variant in places where it could be invoked
 * either from interrupt or non-interrupt context.
 */
extern void dev_kfree_skb_any(struct sk_buff *skb);

#define HAVE_NETIF_RX 1
extern int		netif_rx(struct sk_buff *skb);
extern int		netif_rx_ni(struct sk_buff *skb);
#define HAVE_NETIF_RECEIVE_SKB 1
extern int		netif_receive_skb(struct sk_buff *skb);
extern int		dev_valid_name(const char *name);
extern int		dev_ioctl(unsigned int cmd, void __user *);
extern int		dev_ethtool(struct ifreq *);
extern unsigned		dev_get_flags(const struct net_device *);
extern int		dev_change_flags(struct net_device *, unsigned);
extern int		dev_change_name(struct net_device *, char *);
extern int		dev_set_mtu(struct net_device *, int);
extern int		dev_set_mac_address(struct net_device *,
					    struct sockaddr *);
extern int		dev_hard_start_xmit(struct sk_buff *skb,
					    struct net_device *dev);

extern void		dev_init(void);

extern int		netdev_budget;

/* Called by rtnetlink.c:rtnl_unlock() */
extern void netdev_run_todo(void);

static inline void dev_put(struct net_device *dev)
{
	atomic_dec(&dev->refcnt);
}

static inline void dev_hold(struct net_device *dev)
{
	atomic_inc(&dev->refcnt);
}

/* Carrier loss detection, dial on demand. The functions netif_carrier_on
 * and _off may be called from IRQ context, but it is caller
 * who is responsible for serialization of these calls.
 *
 * The name carrier is inappropriate, these functions should really be
 * called netif_lowerlayer_*() because they represent the state of any
 * kind of lower layer not just hardware media.
 */

extern void linkwatch_fire_event(struct net_device *dev);

static inline int netif_carrier_ok(const struct net_device *dev)
{
	return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
}

extern void __netdev_watchdog_up(struct net_device *dev);

extern void netif_carrier_on(struct net_device *dev);

extern void netif_carrier_off(struct net_device *dev);

static inline void netif_dormant_on(struct net_device *dev)
{
	if (!test_and_set_bit(__LINK_STATE_DORMANT, &dev->state))
		linkwatch_fire_event(dev);
}

static inline void netif_dormant_off(struct net_device *dev)
{
	if (test_and_clear_bit(__LINK_STATE_DORMANT, &dev->state))
		linkwatch_fire_event(dev);
}

static inline int netif_dormant(const struct net_device *dev)
{
	return test_bit(__LINK_STATE_DORMANT, &dev->state);
}


static inline int netif_oper_up(const struct net_device *dev) {
	return (dev->operstate == IF_OPER_UP ||
		dev->operstate == IF_OPER_UNKNOWN /* backward compat */);
}

/* Hot-plugging. */
static inline int netif_device_present(struct net_device *dev)
{
	return test_bit(__LINK_STATE_PRESENT, &dev->state);
}

extern void netif_device_detach(struct net_device *dev);

extern void netif_device_attach(struct net_device *dev);

/*
 * Network interface message level settings
 */
#define HAVE_NETIF_MSG 1

enum {
	NETIF_MSG_DRV		= 0x0001,
	NETIF_MSG_PROBE		= 0x0002,
	NETIF_MSG_LINK		= 0x0004,
	NETIF_MSG_TIMER		= 0x0008,
	NETIF_MSG_IFDOWN	= 0x0010,
	NETIF_MSG_IFUP		= 0x0020,
	NETIF_MSG_RX_ERR	= 0x0040,
	NETIF_MSG_TX_ERR	= 0x0080,
	NETIF_MSG_TX_QUEUED	= 0x0100,
	NETIF_MSG_INTR		= 0x0200,
	NETIF_MSG_TX_DONE	= 0x0400,
	NETIF_MSG_RX_STATUS	= 0x0800,
	NETIF_MSG_PKTDATA	= 0x1000,
	NETIF_MSG_HW		= 0x2000,
	NETIF_MSG_WOL		= 0x4000,
};

#define netif_msg_drv(p)	((p)->msg_enable & NETIF_MSG_DRV)
#define netif_msg_probe(p)	((p)->msg_enable & NETIF_MSG_PROBE)
#define netif_msg_link(p)	((p)->msg_enable & NETIF_MSG_LINK)
#define netif_msg_timer(p)	((p)->msg_enable & NETIF_MSG_TIMER)
#define netif_msg_ifdown(p)	((p)->msg_enable & NETIF_MSG_IFDOWN)
#define netif_msg_ifup(p)	((p)->msg_enable & NETIF_MSG_IFUP)
#define netif_msg_rx_err(p)	((p)->msg_enable & NETIF_MSG_RX_ERR)
#define netif_msg_tx_err(p)	((p)->msg_enable & NETIF_MSG_TX_ERR)
#define netif_msg_tx_queued(p)	((p)->msg_enable & NETIF_MSG_TX_QUEUED)
#define netif_msg_intr(p)	((p)->msg_enable & NETIF_MSG_INTR)
#define netif_msg_tx_done(p)	((p)->msg_enable & NETIF_MSG_TX_DONE)
#define netif_msg_rx_status(p)	((p)->msg_enable & NETIF_MSG_RX_STATUS)
#define netif_msg_pktdata(p)	((p)->msg_enable & NETIF_MSG_PKTDATA)
#define netif_msg_hw(p)		((p)->msg_enable & NETIF_MSG_HW)
#define netif_msg_wol(p)	((p)->msg_enable & NETIF_MSG_WOL)

static inline u32 netif_msg_init(int debug_value, int default_msg_enable_bits)
{
	/* use default */
	if (debug_value < 0 || debug_value >= (sizeof(u32) * 8))
		return default_msg_enable_bits;
	if (debug_value == 0)	/* no output */
		return 0;
	/* set low N bits */
	return (1 << debug_value) - 1;
}

/* Test if receive needs to be scheduled */
static inline int __netif_rx_schedule_prep(struct net_device *dev)
{
	return !test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state);
}

/* Test if receive needs to be scheduled but only if up */
static inline int netif_rx_schedule_prep(struct net_device *dev)
{
	return netif_running(dev) && __netif_rx_schedule_prep(dev);
}

/* Add interface to tail of rx poll list. This assumes that _prep has
 * already been called and returned 1.
 */

extern void __netif_rx_schedule(struct net_device *dev);

/* Try to reschedule poll. Called by irq handler. */

static inline void netif_rx_schedule(struct net_device *dev)
{
	if (netif_rx_schedule_prep(dev))
		__netif_rx_schedule(dev);
}

/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete().
 * Do not inline this?
 */
static inline int netif_rx_reschedule(struct net_device *dev, int undo)
{
	if (netif_rx_schedule_prep(dev)) {
		unsigned long flags;

		dev->quota += undo;

		local_irq_save(flags);
		list_add_tail(&dev->poll_list, &__get_cpu_var(softnet_data).poll_list);
		__raise_softirq_irqoff(NET_RX_SOFTIRQ);
		local_irq_restore(flags);
		return 1;
	}
	return 0;
}

/* Remove interface from poll list: it must be in the poll list
 * on current cpu. This primitive is called by dev->poll(), when
 * it completes the work. The device cannot be out of poll list at this
 * moment, it is BUG().
 */
static inline void netif_rx_complete(struct net_device *dev)
{
	unsigned long flags;

	local_irq_save(flags);
	BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
	list_del(&dev->poll_list);
	smp_mb__before_clear_bit();
	clear_bit(__LINK_STATE_RX_SCHED, &dev->state);
	local_irq_restore(flags);
}

static inline void netif_poll_disable(struct net_device *dev)
{
	while (test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state))
		/* No hurry. */
		schedule_timeout_interruptible(1);
}

static inline void netif_poll_enable(struct net_device *dev)
{
	clear_bit(__LINK_STATE_RX_SCHED, &dev->state);
}

/* same as netif_rx_complete, except that local_irq_save(flags)
 * has already been issued
 */
static inline void __netif_rx_complete(struct net_device *dev)
{
	BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
	list_del(&dev->poll_list);
	smp_mb__before_clear_bit();
	clear_bit(__LINK_STATE_RX_SCHED, &dev->state);
}

static inline void netif_tx_lock(struct net_device *dev)
{
	spin_lock(&dev->_xmit_lock);
	dev->xmit_lock_owner = smp_processor_id();
}

static inline void netif_tx_lock_bh(struct net_device *dev)
{
	spin_lock_bh(&dev->_xmit_lock);
	dev->xmit_lock_owner = smp_processor_id();
}

static inline int netif_tx_trylock(struct net_device *dev)
{
	int ok = spin_trylock(&dev->_xmit_lock);
	if (likely(ok))
		dev->xmit_lock_owner = smp_processor_id();
	return ok;
}

static inline void netif_tx_unlock(struct net_device *dev)
{
	dev->xmit_lock_owner = -1;
	spin_unlock(&dev->_xmit_lock);
}

static inline void netif_tx_unlock_bh(struct net_device *dev)
{
	dev->xmit_lock_owner = -1;
	spin_unlock_bh(&dev->_xmit_lock);
}

static inline void netif_tx_disable(struct net_device *dev)
{
	netif_tx_lock_bh(dev);
	netif_stop_queue(dev);
	netif_tx_unlock_bh(dev);
}

/* These functions live elsewhere (drivers/net/net_init.c, but related) */

extern void		ether_setup(struct net_device *dev);

/* Support for loadable net-drivers */
extern struct net_device *alloc_netdev(int sizeof_priv, const char *name,
				       void (*setup)(struct net_device *));
extern int		register_netdev(struct net_device *dev);
extern void		unregister_netdev(struct net_device *dev);
/* Functions used for multicast support */
extern void		dev_mc_upload(struct net_device *dev);
extern int 		dev_mc_delete(struct net_device *dev, void *addr, int alen, int all);
extern int		dev_mc_add(struct net_device *dev, void *addr, int alen, int newonly);
extern void		dev_mc_discard(struct net_device *dev);
extern void		dev_set_promiscuity(struct net_device *dev, int inc);
extern void		dev_set_allmulti(struct net_device *dev, int inc);
extern void		netdev_state_change(struct net_device *dev);
extern void		netdev_features_change(struct net_device *dev);
/* Load a device via the kmod */
extern void		dev_load(const char *name);
extern void		dev_mcast_init(void);
extern int		netdev_max_backlog;
extern int		weight_p;
extern int		netdev_set_master(struct net_device *dev, struct net_device *master);
extern int skb_checksum_help(struct sk_buff *skb, int inward);
extern struct sk_buff *skb_gso_segment(struct sk_buff *skb, int features);
#ifdef CONFIG_BUG
extern void netdev_rx_csum_fault(struct net_device *dev);
#else
static inline void netdev_rx_csum_fault(struct net_device *dev)
{
}
#endif
/* rx skb timestamps */
extern void		net_enable_timestamp(void);
extern void		net_disable_timestamp(void);

#ifdef CONFIG_PROC_FS
extern void *dev_seq_start(struct seq_file *seq, loff_t *pos);
extern void *dev_seq_next(struct seq_file *seq, void *v, loff_t *pos);
extern void dev_seq_stop(struct seq_file *seq, void *v);
#endif

extern void linkwatch_run_queue(void);

static inline int net_gso_ok(int features, int gso_type)
{
	int feature = gso_type << NETIF_F_GSO_SHIFT;
	return (features & feature) == feature;
}

static inline int skb_gso_ok(struct sk_buff *skb, int features)
{
	return net_gso_ok(features, skb_shinfo(skb)->gso_type);
}

static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb)
{
	return skb_is_gso(skb) &&
	       (!skb_gso_ok(skb, dev->features) ||
		unlikely(skb->ip_summed != CHECKSUM_HW));
}

/* On bonding slaves other than the currently active slave, suppress
 * duplicates except for 802.3ad ETH_P_SLOW and alb non-mcast/bcast.
 */
static inline int skb_bond_should_drop(struct sk_buff *skb)
{
	struct net_device *dev = skb->dev;
	struct net_device *master = dev->master;

	if (master &&
	    (dev->priv_flags & IFF_SLAVE_INACTIVE)) {
		if (master->priv_flags & IFF_MASTER_ALB) {
			if (skb->pkt_type != PACKET_BROADCAST &&
			    skb->pkt_type != PACKET_MULTICAST)
				return 0;
		}
		if (master->priv_flags & IFF_MASTER_8023AD &&
		    skb->protocol == __constant_htons(ETH_P_SLOW))
			return 0;

		return 1;
	}
	return 0;
}

#endif /* __KERNEL__ */

#endif	/* _LINUX_DEV_H */
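
Note on the attached header: the check the panic messages point at appears
to be the BUG_ON in netif_rx_complete() / __netif_rx_complete(), which
requires __LINK_STATE_RX_SCHED to still be set when a poll finishes. The
sketch below is a minimal user-space model of that invariant, not kernel
code. The names model_dev, model_rx_schedule and model_rx_complete are
hypothetical, and the model returns false where the kernel would BUG().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; names echo the header but are not kernel API. */
enum {
	MODEL_START    = 1u << 0,	/* stands in for __LINK_STATE_START */
	MODEL_RX_SCHED = 1u << 1,	/* stands in for __LINK_STATE_RX_SCHED */
};

struct model_dev {
	unsigned int state;
	bool on_poll_list;
};

/*
 * Models netif_rx_schedule(): only a running device that is not already
 * scheduled may be put on the poll list.
 */
static bool model_rx_schedule(struct model_dev *dev)
{
	if (!(dev->state & MODEL_START))
		return false;
	if (dev->state & MODEL_RX_SCHED)
		return false;
	dev->state |= MODEL_RX_SCHED;
	dev->on_poll_list = true;
	return true;
}

/*
 * Models netif_rx_complete(): the device must still be scheduled.
 * Returns false in the case where the kernel's BUG_ON would fire.
 */
static bool model_rx_complete(struct model_dev *dev)
{
	if (!(dev->state & MODEL_RX_SCHED))
		return false;
	dev->on_poll_list = false;
	dev->state &= ~MODEL_RX_SCHED;
	return true;
}
```

Scheduling once and completing twice makes the second model_rx_complete()
call return false, which corresponds to the "kernel BUG at
include/linux/netdevice.h" case. Whether the reported crashes actually come
from two contexts completing the same poll is not established by this
sketch.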
Comment 5 Anonymous Emailer 2007-10-04 20:24:27 UTC
Reply-To: shemminger@linux-foundation.org

On Thu, 04 Oct 2007 18:27:04 -0700
Tina Yang <tina.yang@oracle.com> wrote:

> Andrew Morton wrote:
> > (Please respond by emailed reply-to-all, not via the bugzilla web interface)
> >
> > On Thu,  4 Oct 2007 16:24:18 -0700 (PDT)
> > bugme-daemon@bugzilla.kernel.org wrote:
> >
> >   
> >> http://bugzilla.kernel.org/show_bug.cgi?id=9124
> >>
> >>            Summary: Netconsole race crashed the system
> >>            Product: Networking
> >>            Version: 2.5
> >>      KernelVersion: 2.6.9, 2.6.18, 2.6.23
> >>           Platform: All
> >>         OS/Version: Linux
> >>               Tree: Mainline
> >>             Status: NEW
> >>           Severity: high
> >>           Priority: P1
> >>          Component: Other
> >>         AssignedTo: acme@ghostprotocols.net
> >>         ReportedBy: tina.yang@oracle.com
> >>
> >>
> >> Most recent kernel where this bug did not occur:
> >> Think the problem has always been there.
> >> Distribution:
> >> Hardware Environment:
> >> DELL PowerEdge 2650 (x86)
> >> DELL PowerEdge 2850(x86_64)
> >> HP ProLiant DL380 G5 (x86_64) 
> >> with various NICs - e1000, tg3, bnx2
> >> Software Environment:
> >> 2.6.9, 2.6.18, 2.6.23
> >> Problem Description:
> >> On 2.6.18 found this issue on e1000 and tg3. On mainline 2.6.23-rc* found this
> >>  issue on e100,tg3 and bnx2.  It either panicked
> >> at netdevice.h:890 or hung the system, and sometimes depending
> >> on which NIC are used, the following console message,
> >>  e1000:
> >>       "e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang"
> >>  tg3:
> >>       "NETDEV WATCHDOG: eth4: transmit timed out"
> >>       "tg3: eth4: transmit timed out, resetting"
> >>
> >> Steps to reproduce:
> >> 1. On 2.6.18 (both x86 and x86_64) insert netconsole module. (NIC: e1000 and tg3)
> >> 2. Run a moderate io load, preferably fio - one process doing async+directIO using libaio
> >>
> >> fio jobfile:
> >> [global]
> >> iodepth=1024
> >> iodepth_batch=60
> >> randrepeat=1
> >> size=1024m
> >> directory=/home/oracle
> >> numjobs=2
> >> [job1]
> >> bs=8k
> >> direct=1
> >> ioengine=libaio
> >> rw=randrw
> >> filename=file1:file2
> >>
> >> 3. From second console as root do " echo t > /proc/sysrq-trigger"
> >>
> >> Machine will instantly hang.
> >>
> >>
> >> Crash stack captured on 2.6.9
> >>        PANIC: "kernel BUG at include/linux/netdevice.h:888!"
> >> #0 [ 23c5e60] disk_dump at f9ca71a2
> >> #1 [ 23c5e64] printk at 21228d6
> >> #2 [ 23c5e70] freeze_other_cpus at f9ca6ef5
> >> #3 [ 23c5e80] start_disk_dump at f9ca6fa0
> >> #4 [ 23c5e90] try_crashdump at 2133766
> >> #5 [ 23c5e98] die at 2106354
> >> #6 [ 23c5ecc] do_invalid_op at 210672f
> >> #7 [ 23c5f7c] error_code (via invalid_op) at fffecede
> >>    EAX: 00000006  EBX: 00200202  ECX: 00000000  EDX: df287000  EBP: e05ca000
> >>    DS:  007b      ESI: 00000001  ES:  007b      EDI: e05ca240 
> >>    CS:  0060      EIP: f8c82a08  ERR: ffffffff  EFLAGS: 00210046 
> >> #8 [ 23c5fb8] tg3_poll at f8c82a08
> >> #9 [ 23c5fd0] net_rx_action at 227a8da
> >> #10 [ 23c5fe8] __do_softirq at 2126422
> >> --- <soft IRQ> ---
> >> #0 [25c71cac] do_softirq at 2108460
> >> #1 [25c71cb4] dev_queue_xmit at 227a0d2
> >> #2 [25c71ccc] ip_finish_output at 229288d
> >> #3 [25c71ce4] ip_queue_xmit at 2292fa9
> >> #4 [25c71dac] tcp_transmit_skb at 22a0ff7
> >> #5 [25c71dec] tcp_write_xmit at 22a1901
> >> #6 [25c71e10] tcp_sendmsg at 2297d6d
> >> #7 [25c71e80] sock_aio_write at 2272512
> >> #8 [25c71eec] do_sync_write at 215a444
> >> #9 [25c71f88] vfs_write at 215a53a
> >> #10 [25c71fa4] sys_write at 215a5f4
> >> #11 [25c71fc0] system_call at fffec219 
> >>
> >> net_device in memory,
> >>   name = "eth0\000\000\000\000\000\000\000\000\000\000\000", 
> >>  ...
> >>
> >>
> >> Crash stack captured on 2.6.18
> >>        PANIC: "kernel BUG at include/linux/netdevice.h:890!"
> >>  #0 [c072ce30] crash_kexec at c044418a
> >>  #1 [c072ce74] die at c04054d0
> >>  #2 [c072cea4] do_invalid_op at c0405c20
> >>  #3 [c072cf54] error_code (via invalid_op) at c0404ab3
> >>     EAX: 00000007  EBX: 00000202  ECX: 00000000  EDX: f6d9c000  EBP:
> f6d9c400 
> >>     DS:  007b      ESI: 00000001  ES:  007b      EDI: cb02b280 
> >>     CS:  0060      EIP: f8927791  ERR: ffffffff  EFLAGS: 00010046 
> >>  #4 [c072cf88] tg3_poll at f8927791
> >> --- <soft IRQ> ---
> >>  #0 [f7e54f60] do_softirq at c0406433
> >>  #1 [f7e54f6c] do_IRQ at c0406425
> >>  #2 [f7e54fb4] cpu_idle at c0402c8e
> >>
> >> net_device in memory,
> >>   name = "eth4\000\000\000\000\000\000\000\000\000\000\000", 
> >>   name_hlist = {
> >>     next = 0x0, 
> >>     pprev = 0xc07d0148
> >>   }, 
> >>   ...
> >>
> >>     
> >
> > OK, but in my 2.6.18, include/linux/netdevice.h:890 is a
> > local_irq_restore() in netif_rx_complete().  I don't see how that can go
> > BUG.
> >
> > Does your 2.6.18 have any patches applied?
> >
> > Please tell us what is at include/linux/netdevice.h:890 in your 2.6.18
> > tree.
> >
> > -
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >   
> 
>     netdevice.h attached.
>     890         BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
>    

Comparing your version with the original 2.6.18 from kernel.org git shows:

--- 2.6.18/include/linux/netdevice.h	2007-10-04 20:14:51.000000000 -0700
+++ tina/include/linux//netdevice.h	2007-10-04 20:16:19.000000000 -0700
@@ -342,6 +342,9 @@
 	/* Instance data managed by the core of Wireless Extensions. */
 	struct iw_public_data *	wireless_data;
 
+	/* pending config used by cfg80211/wext compat code only */
+	void *cfg80211_wext_pending_config;
+
 	struct ethtool_ops *ethtool_ops;
 
 	/*
@@ -386,6 +389,7 @@
 	void                    *ip6_ptr;       /* IPv6 specific data */
 	void			*ec_ptr;	/* Econet specific data	*/
 	void			*ax25_ptr;	/* AX.25 specific data */
+	void			*ieee80211_ptr;	/* IEEE 802.11 specific data */
 
 /*
  * Cache line mostly used on receive path (including eth_type_trans())


So you are not using a "pure" v2.6.18 kernel from kernel.org but more likely
a distribution kernel that had already integrated the mac80211 stuff.
Comment 6 Anonymous Emailer 2007-10-04 20:24:55 UTC
Reply-To: shemminger@linux-foundation.org

On Thu, 04 Oct 2007 18:27:04 -0700
Tina Yang <tina.yang@oracle.com> wrote:

> Andrew Morton wrote:
> > (Please respond by emailed reply-to-all, not via the bugzilla web
> > interface)
> >
> > On Thu,  4 Oct 2007 16:24:18 -0700 (PDT)
> > bugme-daemon@bugzilla.kernel.org wrote:
> >
> >   
> >> http://bugzilla.kernel.org/show_bug.cgi?id=9124
> >>     
> >
> > OK, but in my 2.6.18, include/linux/netdevice.h:890 is a
> > local_irq_restore() in netif_rx_complete().  I don't see how that can go
> > BUG.
> >
> > Does your 2.6.18 have any patches applied?
> >
> > Please tell us what is at include/linux/netdevice.h:890 in your 2.6.18
> > tree.
> >
> >   
> 
>     netdevice.h attached.
>     890         BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
>    

Comparing your version with the original 2.6.18 from kernel.org git shows:

--- 2.6.18/include/linux/netdevice.h	2007-10-04 20:14:51.000000000 -0700
+++ tina/include/linux//netdevice.h	2007-10-04 20:16:19.000000000 -0700
@@ -342,6 +342,9 @@
 	/* Instance data managed by the core of Wireless Extensions. */
 	struct iw_public_data *	wireless_data;
 
+	/* pending config used by cfg80211/wext compat code only */
+	void *cfg80211_wext_pending_config;
+
 	struct ethtool_ops *ethtool_ops;
 
 	/*
@@ -386,6 +389,7 @@
 	void                    *ip6_ptr;       /* IPv6 specific data */
 	void			*ec_ptr;	/* Econet specific data	*/
 	void			*ax25_ptr;	/* AX.25 specific data */
+	void			*ieee80211_ptr;	/* IEEE 802.11 specific data */
 
 /*
  * Cache line mostly used on receive path (including eth_type_trans())


So you are not using a "pure" v2.6.18 kernel from kernel.org but more likely
a distribution kernel that had already integrated the mac80211 stuff.
Comment 7 Tina Yang 2007-10-04 21:00:16 UTC
Stephen Hemminger wrote:
> On Thu, 04 Oct 2007 18:27:04 -0700
> Tina Yang <tina.yang@oracle.com> wrote:
>
>   
>> Andrew Morton wrote:
>>     
>>> (Please respond by emailed reply-to-all, not via the bugzilla web
>>> interface)
>>>
>>> On Thu,  4 Oct 2007 16:24:18 -0700 (PDT)
>>> bugme-daemon@bugzilla.kernel.org wrote:
>>>
>>>   
>>>       
>>>> http://bugzilla.kernel.org/show_bug.cgi?id=9124
>>>>         
>>> OK, but in my 2.6.18, include/linux/netdevice.h:890 is a
>>> local_irq_restore() in netif_rx_complete().  I don't see how that can go
>>> BUG.
>>>
>>> Does your 2.6.18 have any patches applied?
>>>
>>> Please tell us what is at include/linux/netdevice.h:890 in your 2.6.18
>>> tree.
>>>
>>>   
>>>       
>>     netdevice.h attached.
>>     890         BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
>>    
>>     
>
> Comparing your version with the original 2.6.18 from kernel.org git shows:
>
> --- 2.6.18/include/linux/netdevice.h  2007-10-04 20:14:51.000000000 -0700
> +++ tina/include/linux//netdevice.h   2007-10-04 20:16:19.000000000 -0700
> @@ -342,6 +342,9 @@
>       /* Instance data managed by the core of Wireless Extensions. */
>       struct iw_public_data * wireless_data;
>  
> +     /* pending config used by cfg80211/wext compat code only */
> +     void *cfg80211_wext_pending_config;
> +
>       struct ethtool_ops *ethtool_ops;
>  
>       /*
> @@ -386,6 +389,7 @@
>       void                    *ip6_ptr;       /* IPv6 specific data */
>       void                    *ec_ptr;        /* Econet specific data */
>       void                    *ax25_ptr;      /* AX.25 specific data */
> +     void                    *ieee80211_ptr; /* IEEE 802.11 specific data */
>  
>  /*
>   * Cache line mostly used on receive path (including eth_type_trans())
>
>
> So you are not using a "pure" v2.6.18 kernel from kernel.org but more likely
> a distribution kernel that had already integrated the mac80211 stuff.
>
>
>   
Yes, it's RHEL5 2.6.18-8.  Attached is the netdevice.h from 2.6.9-42, which
doesn't have the 802.11 changes and crashed at the same spot, netdevice.h:888.
2.6.23-rc2 and rc4 also crashed.
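
For reference, the check at that line can be modelled in user space. The sketch below is an illustration only, not kernel code: model_dev, model_rx_schedule(), model_rx_complete() and model_double_complete() are hypothetical names, and the double-completion scenario is an assumption, since nothing in this report establishes which path cleared the bit. It shows only that the assertion fires when a poll is completed while __LINK_STATE_RX_SCHED (bit 5 in the attached enum) is already clear.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space model; the names only mirror the kernel's. */
enum { MODEL_RX_SCHED = 5 };	/* position of __LINK_STATE_RX_SCHED in the enum */

struct model_dev {
	unsigned long state;	/* stands in for net_device.state */
};

/* Claim the poll: returns 1 if this caller set the bit, 0 if already set. */
static int model_rx_schedule(struct model_dev *dev)
{
	unsigned long mask = 1UL << MODEL_RX_SCHED;

	if (dev->state & mask)
		return 0;
	dev->state |= mask;
	return 1;
}

/* Finish the poll: returns 0 on success, -1 where the kernel would BUG(). */
static int model_rx_complete(struct model_dev *dev)
{
	unsigned long mask = 1UL << MODEL_RX_SCHED;

	/* models BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state)) */
	if (!(dev->state & mask))
		return -1;
	dev->state &= ~mask;
	return 0;
}

/*
 * Assumed scenario: the poll is scheduled once but completed twice,
 * as could happen if two contexts both finished the same poll.
 * Returns the result of the second completion.
 */
static int model_double_complete(void)
{
	struct model_dev dev = { 0 };
	int first, second;

	assert(model_rx_schedule(&dev) == 1);
	first = model_rx_complete(&dev);
	second = model_rx_complete(&dev);
	printf("first=%d second=%d\n", first, second);
	return second;
}
```

Calling model_double_complete() returns the result of the second completion, which is the point where the real assertion would trip. The model is single-threaded and leaves out the poll list and interrupt masking, so it says nothing about how the two completions would interleave on real hardware.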


/*
 * INET		An implementation of the TCP/IP protocol suite for the LINUX
 *		operating system.  INET is implemented using the  BSD Socket
 *		interface as the means of communication with the user level.
 *
 *		Definitions for the Interfaces handler.
 *
 * Version:	@(#)dev.h	1.0.10	08/12/93
 *
 * Authors:	Ross Biro, <bir7@leland.Stanford.Edu>
 *		Fred N. van Kempen, <waltje@uWalt.NL.Mugnet.ORG>
 *		Corey Minyard <wf-rch!minyard@relay.EU.net>
 *		Donald J. Becker, <becker@cesdis.gsfc.nasa.gov>
 *		Alan Cox, <Alan.Cox@linux.org>
 *		Bjorn Ekwall. <bj0rn@blox.se>
 *              Pekka Riikonen <priikone@poseidon.pspt.fi>
 *
 *		This program is free software; you can redistribute it and/or
 *		modify it under the terms of the GNU General Public License
 *		as published by the Free Software Foundation; either version
 *		2 of the License, or (at your option) any later version.
 *
 *		Moved to /usr/include/linux for NET3
 */
#ifndef _LINUX_NETDEVICE_H
#define _LINUX_NETDEVICE_H

#include <linux/if.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>

#ifdef __KERNEL__
#include <asm/atomic.h>
#include <asm/cache.h>
#include <asm/byteorder.h>

#include <linux/config.h>
#include <linux/device.h>
#include <linux/percpu.h>

struct divert_blk;
struct vlan_group;
struct ethtool_ops;
struct netpoll;
struct netpoll_info;

					/* source back-compat hooks */
#define SET_ETHTOOL_OPS(netdev,ops) \
	( (netdev)->ethtool_ops = (ops) )

#define HAVE_ALLOC_NETDEV		/* feature macro: alloc_xxxdev
					   functions are available. */
#define HAVE_FREE_NETDEV		/* free_netdev() */
#define HAVE_NETDEV_PRIV		/* netdev_priv() */

#define NET_XMIT_SUCCESS	0
#define NET_XMIT_DROP		1	/* skb dropped			*/
#define NET_XMIT_CN		2	/* congestion notification	*/
#define NET_XMIT_POLICED	3	/* skb is shot by police	*/
#define NET_XMIT_BYPASS		4	/* packet does not leave via dequeue;
					   (TC use only - dev_queue_xmit
					   returns this as NET_XMIT_SUCCESS) */

/* Backlog congestion levels */
#define NET_RX_SUCCESS		0   /* keep 'em coming, baby */
#define NET_RX_DROP		1  /* packet dropped */
#define NET_RX_CN_LOW		2   /* storm alert, just in case */
#define NET_RX_CN_MOD		3   /* Storm on its way! */
#define NET_RX_CN_HIGH		4   /* The storm is here */
#define NET_RX_BAD		5  /* packet dropped due to kernel error */

#define net_xmit_errno(e)	((e) != NET_XMIT_CN ? -ENOBUFS : 0)

#endif

#define MAX_ADDR_LEN	32		/* Largest hardware address length */

/* Driver transmit return codes */
#define NETDEV_TX_OK 0		/* driver took care of packet */
#define NETDEV_TX_BUSY 1	/* driver tx path was busy*/
#define NETDEV_TX_LOCKED -1	/* driver tx lock was already taken */

/*
 *	Compute the worst case header length according to the protocols
 *	used.
 */
 
#if !defined(CONFIG_AX25) && !defined(CONFIG_AX25_MODULE) && !defined(CONFIG_TR)
#define LL_MAX_HEADER	32
#else
#if defined(CONFIG_AX25) || defined(CONFIG_AX25_MODULE)
#define LL_MAX_HEADER	96
#else
#define LL_MAX_HEADER	48
#endif
#endif

#if !defined(CONFIG_NET_IPIP) && \
    !defined(CONFIG_IPV6) && !defined(CONFIG_IPV6_MODULE)
#define MAX_HEADER LL_MAX_HEADER
#else
#define MAX_HEADER (LL_MAX_HEADER + 48)
#endif

/*
 *	Network device statistics. Akin to the 2.0 ether stats but
 *	with byte counters.
 */
 
struct net_device_stats
{
	unsigned long	rx_packets;		/* total packets received	*/
	unsigned long	tx_packets;		/* total packets transmitted	*/
	unsigned long	rx_bytes;		/* total bytes received 	*/
	unsigned long	tx_bytes;		/* total bytes transmitted	*/
	unsigned long	rx_errors;		/* bad packets received		*/
	unsigned long	tx_errors;		/* packet transmit problems	*/
	unsigned long	rx_dropped;		/* no space in linux buffers	*/
	unsigned long	tx_dropped;		/* no space available in linux	*/
	unsigned long	multicast;		/* multicast packets received	*/
	unsigned long	collisions;

	/* detailed rx_errors: */
	unsigned long	rx_length_errors;
	unsigned long	rx_over_errors;		/* receiver ring buff overflow	*/
	unsigned long	rx_crc_errors;		/* recved pkt with crc error	*/
	unsigned long	rx_frame_errors;	/* recv'd frame alignment error */
	unsigned long	rx_fifo_errors;		/* recv'r fifo overrun		*/
	unsigned long	rx_missed_errors;	/* receiver missed packet	*/

	/* detailed tx_errors */
	unsigned long	tx_aborted_errors;
	unsigned long	tx_carrier_errors;
	unsigned long	tx_fifo_errors;
	unsigned long	tx_heartbeat_errors;
	unsigned long	tx_window_errors;
	
	/* for cslip etc */
	unsigned long	rx_compressed;
	unsigned long	tx_compressed;
};


/* Media selection options. */
enum {
        IF_PORT_UNKNOWN = 0,
        IF_PORT_10BASE2,
        IF_PORT_10BASET,
        IF_PORT_AUI,
        IF_PORT_100BASET,
        IF_PORT_100BASETX,
        IF_PORT_100BASEFX
};

#ifdef __KERNEL__

#include <linux/cache.h>
#include <linux/skbuff.h>

struct neighbour;
struct neigh_parms;
struct sk_buff;

struct netif_rx_stats
{
	unsigned total;
	unsigned dropped;
	unsigned time_squeeze;
	unsigned throttled;
	unsigned fastroute_hit;
	unsigned fastroute_success;
	unsigned fastroute_defer;
	unsigned fastroute_deferred_out;
	unsigned fastroute_latency_reduction;
	unsigned cpu_collision;
};

DECLARE_PER_CPU(struct netif_rx_stats, netdev_rx_stat);


/*
 *	We tag multicasts with these structures.
 */
 
struct dev_mc_list
{	
	struct dev_mc_list	*next;
	__u8			dmi_addr[MAX_ADDR_LEN];
	unsigned char		dmi_addrlen;
	int			dmi_users;
	int			dmi_gusers;
};

struct hh_cache
{
	struct hh_cache *hh_next;	/* Next entry			     */
	atomic_t	hh_refcnt;	/* number of users                   */
	unsigned short  hh_type;	/* protocol identifier, f.e ETH_P_IP
                                         *  NOTE:  For VLANs, this will be the
                                         *  encapsulated type. --BLG
                                         */
	int		hh_len;		/* length of header */
	int		(*hh_output)(struct sk_buff *skb);
	rwlock_t	hh_lock;

	/* cached hardware header; allow for machine alignment needs.        */
#define HH_DATA_MOD	16
#define HH_DATA_OFF(__len) \
	(HH_DATA_MOD - ((__len) & (HH_DATA_MOD - 1)))
#define HH_DATA_ALIGN(__len) \
	(((__len)+(HH_DATA_MOD-1))&~(HH_DATA_MOD - 1))
	unsigned long	hh_data[HH_DATA_ALIGN(LL_MAX_HEADER) / sizeof(long)];
};

/* Reserve HH_DATA_MOD byte aligned hard_header_len, but at least that much.
 * Alternative is:
 *   dev->hard_header_len ? (dev->hard_header_len +
 *                           (HH_DATA_MOD - 1)) & ~(HH_DATA_MOD - 1) : 0
 *
 * We could use other alignment values, but we must maintain the
 * relationship HH alignment <= LL alignment.
 */
#define LL_RESERVED_SPACE(dev) \
	(((dev)->hard_header_len&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
#define LL_RESERVED_SPACE_EXTRA(dev,extra) \
	((((dev)->hard_header_len+extra)&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)

/* These flag bits are private to the generic network queueing
 * layer, they may not be explicitly referenced by any other
 * code.
 */

enum netdev_state_t
{
	__LINK_STATE_XOFF=0,
	__LINK_STATE_START,
	__LINK_STATE_PRESENT,
	__LINK_STATE_SCHED,
	__LINK_STATE_NOCARRIER,
	__LINK_STATE_RX_SCHED,
	__LINK_STATE_LINKWATCH_PENDING
};


/*
 * This structure holds at boot time configured netdevice settings. They
 * are then used in the device probing. 
 */
struct netdev_boot_setup {
	char name[IFNAMSIZ];
	struct ifmap map;
};
#define NETDEV_BOOT_SETUP_MAX 8


/*
 *	The DEVICE structure.
 *	Actually, this whole structure is a big mistake.  It mixes I/O
 *	data with strictly "high-level" data, and it has to know about
 *	almost every data structure used in the INET module.
 *
 *	FIXME: cleanup struct net_device such that network protocol info
 *	moves out.
 */

struct net_device
{

	/*
	 * This is the first field of the "visible" part of this structure
	 * (i.e. as seen by users in the "Space.c" file).  It is the name
	 * of the interface.
	 */
	char			name[IFNAMSIZ];

	/*
	 *	I/O specific fields
	 *	FIXME: Merge these and struct ifmap into one
	 */
	unsigned long		mem_end;	/* shared mem end	*/
	unsigned long		mem_start;	/* shared mem start	*/
	unsigned long		base_addr;	/* device I/O address	*/
	unsigned int		irq;		/* device IRQ number	*/

	/*
	 *	Some hardware also needs these fields, but they are not
	 *	part of the usual set specified in Space.c.
	 */

	unsigned char		if_port;	/* Selectable AUI, TP,..*/
	unsigned char		dma;		/* DMA channel		*/

	unsigned long		state;

	struct net_device	*next;
	
	/* The device initialization function. Called only once. */
	int			(*init)(struct net_device *dev);

	/* ------- Fields preinitialized in Space.c finish here ------- */

	struct net_device	*next_sched;

	/* Interface index. Unique device identifier	*/
	int			ifindex;
	int			iflink;


	struct net_device_stats* (*get_stats)(struct net_device *dev);
	struct iw_statistics*	(*get_wireless_stats)(struct net_device *dev);

	/* List of functions to handle Wireless Extensions (instead of ioctl).
	 * See <net/iw_handler.h> for details. Jean II */
	struct iw_handler_def *	wireless_handlers;

	struct ethtool_ops *ethtool_ops;

	/*
	 * This marks the end of the "visible" part of the structure. All
	 * fields hereafter are internal to the system, and may change at
	 * will (read: may be cleaned up at will).
	 */

	/* These may be needed for future network-power-down code. */
	unsigned long		trans_start;	/* Time (in jiffies) of last Tx	*/
	unsigned long		last_rx;	/* Time of last Rx	*/

	unsigned short		flags;	/* interface flags (a la BSD)	*/
	unsigned short		gflags;
        unsigned short          priv_flags; /* Like 'flags' but invisible to userspace. */
        unsigned short          unused_alignment_fixer; /* Because we need priv_flags,
                                                         * and we want to be 32-bit aligned.
                                                         */

	unsigned		mtu;	/* interface MTU value		*/
	unsigned short		type;	/* interface hardware type	*/
	unsigned short		hard_header_len;	/* hardware hdr length	*/
	void			*priv;	/* pointer to private data	*/

	struct net_device	*master; /* Pointer to master device of a group,
					  * which this device is member of.
					  */

	/* Interface address info. */
	unsigned char		broadcast[MAX_ADDR_LEN];	/* hw bcast add	*/
	unsigned char		dev_addr[MAX_ADDR_LEN];	/* hw address	*/
	unsigned char		addr_len;	/* hardware address length	*/
#ifndef __GENKSYMS__
	unsigned char		reserved;
	unsigned short		priv_len;
#endif

	struct dev_mc_list	*mc_list;	/* Multicast mac addresses	*/
	int			mc_count;	/* Number of installed mcasts	*/
	int			promiscuity;
	int			allmulti;

	int			watchdog_timeo;
	struct timer_list	watchdog_timer;

	/* Protocol specific pointers */
	
	void 			*atalk_ptr;	/* AppleTalk link 	*/
	void			*ip_ptr;	/* IPv4 specific data	*/  
	void                    *dn_ptr;        /* DECnet specific data */
	void                    *ip6_ptr;       /* IPv6 specific data */
	void			*ec_ptr;	/* Econet specific data	*/
	void			*ax25_ptr;	/* AX.25 specific data */

	struct list_head	poll_list;	/* Link to poll list	*/
	int			quota;
	int			weight;

	struct Qdisc		*qdisc;
	struct Qdisc		*qdisc_sleeping;
	struct Qdisc		*qdisc_ingress;
	struct list_head	qdisc_list;
	unsigned long		tx_queue_len;	/* Max frames per queue allowed */

	/* ingress path synchronizer */
	spinlock_t		ingress_lock;
	/* hard_start_xmit synchronizer */
	spinlock_t		xmit_lock;
	/* cpu id of processor entered to hard_start_xmit or -1,
	   if nobody entered there.
	 */
	int			xmit_lock_owner;
	/* device queue lock */
	spinlock_t		queue_lock;
	/* Number of references to this device */
	atomic_t		refcnt;
	/* delayed register/unregister */
	struct list_head	todo_list;
	/* device name hash chain */
	struct hlist_node	name_hlist;
	/* device index hash chain */
	struct hlist_node	index_hlist;

	/* register/unregister state machine */
	enum { NETREG_UNINITIALIZED=0,
	       NETREG_REGISTERING,	/* called register_netdevice */
	       NETREG_REGISTERED,	/* completed register todo */
	       NETREG_UNREGISTERING,	/* called unregister_netdevice */
	       NETREG_UNREGISTERED,	/* completed unregister todo */
	       NETREG_RELEASED,		/* called free_netdev */
	} reg_state;

	/* Net device features */
	int			features;
#define NETIF_F_SG		1	/* Scatter/gather IO. */
#define NETIF_F_IP_CSUM		2	/* Can checksum only TCP/UDP over IPv4. */
#define NETIF_F_NO_CSUM		4	/* Does not require checksum. F.e. loopback. */
#define NETIF_F_HW_CSUM		8	/* Can checksum all the packets. */
#define NETIF_F_HIGHDMA		32	/* Can DMA to high memory. */
#define NETIF_F_FRAGLIST	64	/* Scatter/gather IO. */
#define NETIF_F_HW_VLAN_TX	128	/* Transmit VLAN hw acceleration */
#define NETIF_F_HW_VLAN_RX	256	/* Receive VLAN hw acceleration */
#define NETIF_F_HW_VLAN_FILTER	512	/* Receive filtering on VLAN */
#define NETIF_F_VLAN_CHALLENGED	1024	/* Device cannot handle VLAN packets */
#define NETIF_F_TSO		2048	/* Can offload TCP/IP segmentation */
#define NETIF_F_LLTX		4096	/* LockLess TX */

	/* Called after device is detached from network. */
	void			(*uninit)(struct net_device *dev);
	/* Called after last user reference disappears. */
	void			(*destructor)(struct net_device *dev);

	/* Pointers to interface service routines.	*/
	int			(*open)(struct net_device *dev);
	int			(*stop)(struct net_device *dev);
	int			(*hard_start_xmit) (struct sk_buff *skb,
						    struct net_device *dev);
#define HAVE_NETDEV_POLL
	int			(*poll) (struct net_device *dev, int *quota);
	int			(*hard_header) (struct sk_buff *skb,
						struct net_device *dev,
						unsigned short type,
						void *daddr,
						void *saddr,
						unsigned len);
	int			(*rebuild_header)(struct sk_buff *skb);
#define HAVE_MULTICAST			 
	void			(*set_multicast_list)(struct net_device *dev);
#define HAVE_SET_MAC_ADDR  		 
	int			(*set_mac_address)(struct net_device *dev,
						   void *addr);
#define HAVE_PRIVATE_IOCTL
	int			(*do_ioctl)(struct net_device *dev,
					    struct ifreq *ifr, int cmd);
#define HAVE_SET_CONFIG
	int			(*set_config)(struct net_device *dev,
					      struct ifmap *map);
#define HAVE_HEADER_CACHE
	int			(*hard_header_cache)(struct neighbour *neigh,
						     struct hh_cache *hh);
	void			(*header_cache_update)(struct hh_cache *hh,
						       struct net_device *dev,
						       unsigned char *  haddr);
#define HAVE_CHANGE_MTU
	int			(*change_mtu)(struct net_device *dev, int new_mtu);

#define HAVE_TX_TIMEOUT
	void			(*tx_timeout) (struct net_device *dev);

	void			(*vlan_rx_register)(struct net_device *dev,
						    struct vlan_group *grp);
	void			(*vlan_rx_add_vid)(struct net_device *dev,
						   unsigned short vid);
	void			(*vlan_rx_kill_vid)(struct net_device *dev,
						    unsigned short vid);

	int			(*hard_header_parse)(struct sk_buff *skb,
						     unsigned char *haddr);
	int			(*neigh_setup)(struct net_device *dev, struct neigh_parms *);
	int			(*accept_fastpath)(struct net_device *, struct dst_entry*);
#ifdef CONFIG_NETPOLL
	int			netpoll_rx;
#endif
#ifdef CONFIG_NET_POLL_CONTROLLER
	void                    (*poll_controller)(struct net_device *dev);
#endif

	/* bridge stuff */
	struct net_bridge_port	*br_port;

#ifdef CONFIG_NET_DIVERT
	/* this will get initialized at each interface type init routine */
	struct divert_blk	*divert;
#endif /* CONFIG_NET_DIVERT */

	/* class/net/name entry */
	struct class_device	class_dev;
	/* how much padding had been added by alloc_netdev() */
	int padded;
};

/*
 *  Structure used to maintain kABI.
 */
struct net_device_wrapper {
	void (*netpoll_setup)(struct net_device *dev, struct netpoll_info *npinfo);
	int (*netpoll_start_xmit)(struct netpoll *np, struct sk_buff *skb,
				  struct net_device *dev);
	struct netpoll_info *npinfo;
};

#define	NETDEV_ALIGN		32
#define	NETDEV_ALIGN_CONST	(NETDEV_ALIGN - 1)

static inline void *netdev_priv(struct net_device *dev)
{
	return (char *)dev + ((sizeof(struct net_device)
					+ NETDEV_ALIGN_CONST)
				& ~NETDEV_ALIGN_CONST);
}

static inline struct net_device_wrapper *dev_wrapper(struct net_device *dev)
{
	if (!(dev->priv_flags & IFF_EXTENDED))
		return NULL;

	return (struct net_device_wrapper *) ((char *) netdev_priv(dev) +
		((dev->priv_len + NETDEV_ALIGN_CONST) & ~NETDEV_ALIGN_CONST));
}

#define SET_MODULE_OWNER(dev) do { } while (0)
/* Set the sysfs physical device reference for the network logical device
 * if set prior to registration will cause a symlink during initialization.
 */
#define SET_NETDEV_DEV(net, pdev)	((net)->class_dev.dev = (pdev))

struct packet_type {
	unsigned short		type;	/* This is really htons(ether_type).	*/
	struct net_device		*dev;	/* NULL is wildcarded here		*/
	int			(*func) (struct sk_buff *, struct net_device *,
					 struct packet_type *);
	void			*af_packet_priv;
	struct list_head	list;
};

#include <linux/interrupt.h>
#include <linux/notifier.h>

extern struct net_device		loopback_dev;		/* The loopback */
extern struct net_device		*dev_base;		/* All devices */
extern rwlock_t				dev_base_lock;		/* Device list lock */

extern int			netdev_boot_setup_add(char *name, struct ifmap *map);
extern int 			netdev_boot_setup_check(struct net_device *dev);
extern unsigned long		netdev_boot_base(const char *prefix, int unit);
extern struct net_device    *dev_getbyhwaddr(unsigned short type, char *hwaddr);
extern struct net_device *dev_getfirstbyhwtype(unsigned short type);
extern void		dev_add_pack(struct packet_type *pt);
extern void		dev_remove_pack(struct packet_type *pt);
extern void		__dev_remove_pack(struct packet_type *pt);

extern struct net_device	*dev_get_by_flags(unsigned short flags,
						  unsigned short mask);
extern struct net_device	*dev_get_by_name(const char *name);
extern struct net_device	*__dev_get_by_name(const char *name);
extern int		dev_alloc_name(struct net_device *dev, const char *name);
extern int		dev_open(struct net_device *dev);
extern int		dev_close(struct net_device *dev);
extern int		dev_queue_xmit(struct sk_buff *skb);
extern int		register_netdevice(struct net_device *dev);
extern int		unregister_netdevice(struct net_device *dev);
extern void		free_netdev(struct net_device *dev);
extern void		synchronize_net(void);
extern int 		register_netdevice_notifier(struct notifier_block *nb);
extern int		unregister_netdevice_notifier(struct notifier_block *nb);
extern int		call_netdevice_notifiers(unsigned long val, void *v);
extern struct net_device	*dev_get_by_index(int ifindex);
extern struct net_device	*__dev_get_by_index(int ifindex);
extern int		dev_restart(struct net_device *dev);
#ifdef CONFIG_NETPOLL_TRAP
extern int		netpoll_trap(void);
#endif

typedef int gifconf_func_t(struct net_device * dev, char __user * bufptr, int len);
extern int		register_gifconf(unsigned int family, gifconf_func_t * gifconf);
static inline int unregister_gifconf(unsigned int family)
{
	return register_gifconf(family, NULL);
}

/*
 * Incoming packets are placed on per-cpu queues so that
 * no locking is needed.
 */

struct softnet_data
{
	int			throttle;
	int			cng_level;
	int			avg_blog;
	struct sk_buff_head	input_pkt_queue;
	struct list_head	poll_list;
	struct net_device	*output_queue;
	struct sk_buff		*completion_queue;

	struct net_device	backlog_dev;	/* Sorry. 8) */
};

DECLARE_PER_CPU(struct softnet_data,softnet_data);

#define HAVE_NETIF_QUEUE

static inline void __netif_schedule(struct net_device *dev)
{
	if (!test_and_set_bit(__LINK_STATE_SCHED, &dev->state)) {
		unsigned long flags;
		struct softnet_data *sd;

		local_irq_save(flags);
		sd = &__get_cpu_var(softnet_data);
		dev->next_sched = sd->output_queue;
		sd->output_queue = dev;
		raise_softirq_irqoff(NET_TX_SOFTIRQ);
		local_irq_restore(flags);
	}
}

static inline void netif_schedule(struct net_device *dev)
{
	if (!test_bit(__LINK_STATE_XOFF, &dev->state))
		__netif_schedule(dev);
}

static inline void netif_start_queue(struct net_device *dev)
{
	clear_bit(__LINK_STATE_XOFF, &dev->state);
}

static inline void netif_wake_queue(struct net_device *dev)
{
	if (test_and_clear_bit(__LINK_STATE_XOFF, &dev->state))
		__netif_schedule(dev);
}

static inline void netif_stop_queue(struct net_device *dev)
{
	set_bit(__LINK_STATE_XOFF, &dev->state);
}

static inline int netif_queue_stopped(const struct net_device *dev)
{
	return test_bit(__LINK_STATE_XOFF, &dev->state);
}

static inline int netif_running(const struct net_device *dev)
{
	return test_bit(__LINK_STATE_START, &dev->state);
}


/* Use this variant when it is known for sure that it
 * is executing from interrupt context.
 */
static inline void dev_kfree_skb_irq(struct sk_buff *skb)
{
	if (atomic_dec_and_test(&skb->users)) {
		struct softnet_data *sd;
		unsigned long flags;

		local_irq_save(flags);
		sd = &__get_cpu_var(softnet_data);
		skb->next = sd->completion_queue;
		sd->completion_queue = skb;
		raise_softirq_irqoff(NET_TX_SOFTIRQ);
		local_irq_restore(flags);
	}
}

/* Use this variant in places where it could be invoked
 * either from interrupt or non-interrupt context.
 */
static inline void dev_kfree_skb_any(struct sk_buff *skb)
{
	if (in_irq() || irqs_disabled())
		dev_kfree_skb_irq(skb);
	else
		dev_kfree_skb(skb);
}

#define HAVE_NETIF_RX 1
extern int		netif_rx(struct sk_buff *skb);
#define HAVE_NETIF_RECEIVE_SKB 1
extern int		netif_receive_skb(struct sk_buff *skb);
extern int		dev_ioctl(unsigned int cmd, void __user *);
extern int		dev_ethtool(struct ifreq *);
extern unsigned		dev_get_flags(const struct net_device *);
extern int		dev_change_flags(struct net_device *, unsigned);
extern int		dev_change_name(struct net_device *, char *);
extern int		dev_set_mtu(struct net_device *, int);
extern void		dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev);

extern void		dev_init(void);

extern int		netdev_nit;

/* Post buffer to the network code from _non interrupt_ context.
 * see net/core/dev.c for netif_rx description.
 */
static inline int netif_rx_ni(struct sk_buff *skb)
{
       int err = netif_rx(skb);
       if (softirq_pending(smp_processor_id()))
               do_softirq();
       return err;
}

/* Called by rtnetlink.c:rtnl_unlock() */
extern void netdev_run_todo(void);

static inline void dev_put(struct net_device *dev)
{
	atomic_dec(&dev->refcnt);
}

#define __dev_put(dev) atomic_dec(&(dev)->refcnt)
#define dev_hold(dev) atomic_inc(&(dev)->refcnt)

/* Carrier loss detection, dial on demand. The functions netif_carrier_on
 * and _off may be called from IRQ context, but it is caller
 * who is responsible for serialization of these calls.
 */

extern void linkwatch_fire_event(struct net_device *dev);

static inline int netif_carrier_ok(const struct net_device *dev)
{
	return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
}

extern void __netdev_watchdog_up(struct net_device *dev);

static inline void netif_carrier_on(struct net_device *dev)
{
	if (test_and_clear_bit(__LINK_STATE_NOCARRIER, &dev->state))
		linkwatch_fire_event(dev);
	if (netif_running(dev))
		__netdev_watchdog_up(dev);
}

static inline void netif_carrier_off(struct net_device *dev)
{
	if (!test_and_set_bit(__LINK_STATE_NOCARRIER, &dev->state))
		linkwatch_fire_event(dev);
}

/* Hot-plugging. */
static inline int netif_device_present(struct net_device *dev)
{
	return test_bit(__LINK_STATE_PRESENT, &dev->state);
}

static inline void netif_device_detach(struct net_device *dev)
{
	if (test_and_clear_bit(__LINK_STATE_PRESENT, &dev->state) &&
	    netif_running(dev)) {
		netif_stop_queue(dev);
	}
}

static inline void netif_device_attach(struct net_device *dev)
{
	if (!test_and_set_bit(__LINK_STATE_PRESENT, &dev->state) &&
	    netif_running(dev)) {
		netif_wake_queue(dev);
 		__netdev_watchdog_up(dev);
	}
}

/*
 * Network interface message level settings
 */
#define HAVE_NETIF_MSG 1

enum {
	NETIF_MSG_DRV		= 0x0001,
	NETIF_MSG_PROBE		= 0x0002,
	NETIF_MSG_LINK		= 0x0004,
	NETIF_MSG_TIMER		= 0x0008,
	NETIF_MSG_IFDOWN	= 0x0010,
	NETIF_MSG_IFUP		= 0x0020,
	NETIF_MSG_RX_ERR	= 0x0040,
	NETIF_MSG_TX_ERR	= 0x0080,
	NETIF_MSG_TX_QUEUED	= 0x0100,
	NETIF_MSG_INTR		= 0x0200,
	NETIF_MSG_TX_DONE	= 0x0400,
	NETIF_MSG_RX_STATUS	= 0x0800,
	NETIF_MSG_PKTDATA	= 0x1000,
	NETIF_MSG_HW		= 0x2000,
	NETIF_MSG_WOL		= 0x4000,
};

#define netif_msg_drv(p)	((p)->msg_enable & NETIF_MSG_DRV)
#define netif_msg_probe(p)	((p)->msg_enable & NETIF_MSG_PROBE)
#define netif_msg_link(p)	((p)->msg_enable & NETIF_MSG_LINK)
#define netif_msg_timer(p)	((p)->msg_enable & NETIF_MSG_TIMER)
#define netif_msg_ifdown(p)	((p)->msg_enable & NETIF_MSG_IFDOWN)
#define netif_msg_ifup(p)	((p)->msg_enable & NETIF_MSG_IFUP)
#define netif_msg_rx_err(p)	((p)->msg_enable & NETIF_MSG_RX_ERR)
#define netif_msg_tx_err(p)	((p)->msg_enable & NETIF_MSG_TX_ERR)
#define netif_msg_tx_queued(p)	((p)->msg_enable & NETIF_MSG_TX_QUEUED)
#define netif_msg_intr(p)	((p)->msg_enable & NETIF_MSG_INTR)
#define netif_msg_tx_done(p)	((p)->msg_enable & NETIF_MSG_TX_DONE)
#define netif_msg_rx_status(p)	((p)->msg_enable & NETIF_MSG_RX_STATUS)
#define netif_msg_pktdata(p)	((p)->msg_enable & NETIF_MSG_PKTDATA)
#define netif_msg_hw(p)		((p)->msg_enable & NETIF_MSG_HW)
#define netif_msg_wol(p)	((p)->msg_enable & NETIF_MSG_WOL)

static inline u32 netif_msg_init(int debug_value, int default_msg_enable_bits)
{
	/* use default */
	if (debug_value < 0 || debug_value >= (sizeof(u32) * 8))
		return default_msg_enable_bits;
	if (debug_value == 0)	/* no output */
		return 0;
	/* set low N bits */
	return (1 << debug_value) - 1;
}

/* Schedule rx intr now? */

static inline int netif_rx_schedule_prep(struct net_device *dev)
{
	return netif_running(dev) &&
		!test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state);
}

/* Add interface to tail of rx poll list. This assumes that _prep has
 * already been called and returned 1.
 */

static inline void __netif_rx_schedule(struct net_device *dev)
{
	unsigned long flags;

	local_irq_save(flags);
	dev_hold(dev);
	list_add_tail(&dev->poll_list, &__get_cpu_var(softnet_data).poll_list);
	if (dev->quota < 0)
		dev->quota += dev->weight;
	else
		dev->quota = dev->weight;
	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
	local_irq_restore(flags);
}

/* Try to reschedule poll. Called by irq handler. */

static inline void netif_rx_schedule(struct net_device *dev)
{
	if (netif_rx_schedule_prep(dev))
		__netif_rx_schedule(dev);
}

/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete().
 * Do not inline this?
 */
static inline int netif_rx_reschedule(struct net_device *dev, int undo)
{
	if (netif_rx_schedule_prep(dev)) {
		unsigned long flags;

		dev->quota += undo;

		local_irq_save(flags);
		list_add_tail(&dev->poll_list, &__get_cpu_var(softnet_data).poll_list);
		__raise_softirq_irqoff(NET_RX_SOFTIRQ);
		local_irq_restore(flags);
		return 1;
	}
	return 0;
}

/* Remove interface from poll list: it must be in the poll list
 * on current cpu. This primitive is called by dev->poll(), when
 * it completes the work. The device cannot be out of poll list at this
 * moment, it is BUG().
 */
static inline void netif_rx_complete(struct net_device *dev)
{
	unsigned long flags;

	local_irq_save(flags);
	BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
	list_del(&dev->poll_list);
	smp_mb__before_clear_bit();
	clear_bit(__LINK_STATE_RX_SCHED, &dev->state);
	local_irq_restore(flags);
}

static inline void netif_poll_disable(struct net_device *dev)
{
	while (test_and_set_bit(__LINK_STATE_RX_SCHED, &dev->state)) {
		/* No hurry. */
		current->state = TASK_INTERRUPTIBLE;
		schedule_timeout(1);
	}
}

static inline void netif_poll_enable(struct net_device *dev)
{
	clear_bit(__LINK_STATE_RX_SCHED, &dev->state);
}

/* same as netif_rx_complete, except that local_irq_save(flags)
 * has already been issued
 */
static inline void __netif_rx_complete(struct net_device *dev)
{
	BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
	list_del(&dev->poll_list);
	smp_mb__before_clear_bit();
	clear_bit(__LINK_STATE_RX_SCHED, &dev->state);
}

static inline void netif_tx_disable(struct net_device *dev)
{
	spin_lock_bh(&dev->xmit_lock);
	netif_stop_queue(dev);
	spin_unlock_bh(&dev->xmit_lock);
}

/* These functions live elsewhere (drivers/net/net_init.c, but related) */

extern void		ether_setup(struct net_device *dev);
extern void		fddi_setup(struct net_device *dev);
extern void		tr_setup(struct net_device *dev);
extern void		fc_setup(struct net_device *dev);
extern void		fc_freedev(struct net_device *dev);
/* Support for loadable net-drivers */
extern struct net_device *alloc_netdev(int sizeof_priv, const char *name,
				       void (*setup)(struct net_device *));
extern int		register_netdev(struct net_device *dev);
extern void		unregister_netdev(struct net_device *dev);
/* Functions used for multicast support */
extern void		dev_mc_upload(struct net_device *dev);
extern int 		dev_mc_delete(struct net_device *dev, void *addr, int alen, int all);
extern int		dev_mc_add(struct net_device *dev, void *addr, int alen, int newonly);
extern void		dev_mc_discard(struct net_device *dev);
extern void		dev_set_promiscuity(struct net_device *dev, int inc);
extern void		dev_set_allmulti(struct net_device *dev, int inc);
extern void		netdev_state_change(struct net_device *dev);
/* Load a device via the kmod */
extern void		dev_load(const char *name);
extern void		dev_mcast_init(void);
extern int		netdev_max_backlog;
extern int		weight_p;
extern unsigned long	netdev_fc_xoff;
extern atomic_t netdev_dropping;
extern int		netdev_set_master(struct net_device *dev, struct net_device *master);
extern int skb_checksum_help(struct sk_buff **pskb, int inward);

#ifdef CONFIG_SYSCTL
extern char *net_sysctl_strdup(const char *s);
#endif

#endif /* __KERNEL__ */

#endif	/* _LINUX_DEV_H */
Comment 8 David S. Miller 2007-10-04 22:22:22 UTC
From: Tina Yang <tina.yang@oracle.com>
Date: Thu, 04 Oct 2007 20:56:50 -0700

>     Yes, it's RHEL5 2.6.18-8.  Attached is the 2.6.9-42 version that 
> doesn't have 802.11 and
>     crashed at the same spot - netdevice.h:888.  Also crashed are 
> 2.6.23-rc2 and rc4.

2.6.9-42 is, again, a vendor modified kernel.

Please show the crash information against vanilla upstream kernels if
you want us to take a serious look at this, thanks a lot.
Comment 9 Neil Horman 2007-10-05 13:28:42 UTC
I've got an upstream patch to this in two commits here:
http://git.infradead.org/?p=users/nhorman/linux-2.6.git;a=shortlog;h=netpoll

I have not reproduced the problem myself, but I have sent the patch to Ingo for testing.

This problem was originally reported upstream and fixed here:
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.24.git;a=commit;h=29578624e354f56143d92510fff33a8b2aaa2c03
and then later reverted here:
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.24.git;a=commit;h=2e27afb300b56d83bb03fbfa68852b9c1e2920c6

The patches in my tree are my attempt to fix it while avoiding the e1000 regression as I understood it.  Hopefully Ingo will have tested them soon; they are also being tested at the customer site on the older Red Hat 2.6.9-42 kernel.
Comment 10 Tina Yang 2007-10-05 18:36:17 UTC
David Miller wrote:
> From: Tina Yang <tina.yang@oracle.com>
> Date: Thu, 04 Oct 2007 20:56:50 -0700
>
>   
>>     Yes, it's RHEL5 2.6.18-8.  Attached is the 2.6.9-42 version that 
>> doesn't have 802.11 and
>>     crashed at the same spot - netdevice.h:888.  Also crashed are 
>> 2.6.23-rc2 and rc4.
>>     
>
> 2.6.9-42 is, again, a vendor modified kernel.
>
> Please show the crash information against vanilla upstream kernels if
> you want us to take a serious look at this, thanks a lot.
>   
OK, I ran it on 2.6.23-rc8 and it hung.
The console reported a panic at tg3_poll+0x747 (netdevice.h:1008). Is that the same place?
I am not sure whether I can get a core dump at this point...

 =======================
Code: 0f b7 42 10 3b 43 60 0f 85 a3 00 00 00 85 c9 0f 94 c0 0f b6 f0 85 f6 0f 84 95 00 00 00 9c 5b fa 8b 6c 24 08 8b 45 2c a8 20 75 04 <0f> 0b eb fe 8b 44 24 08 05 80 01 00 00 e8 09 d3 b3 c7 8b 44 24
EIP: [<f89a975d>] tg3_poll+0x747/0x7ee (tg3) SS:ESP 0068:c0753f80
Kernel panic - not syncing: Fatal exception in interrupt
Comment 11 Neil Horman 2007-10-08 06:20:50 UTC
Hmm, do you have the rest of this crash?  And can you post the patch that you used to test with?  The fact that you got a BUG in netdevice.h:1008 suggests that the __LINK_STATE_RX_SCHED flag was cleared while the __LINK_STATE_NETPOLL_SERVICED bit was never set.  To my mind this suggests that:
1) Netpoll isn't involved in this crash (which makes the above patch moot)
2) Someone is clearing the RX_SCHED flag without holding the poll_lock, which would be bad.
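
The failure mode described above can be sketched in user space. This is only a single-threaded model, not kernel code: it assumes the BUG is the BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, ...)) check in __netif_rx_complete(), and the names model_dev, MODEL_RX_SCHED, model_rx_schedule and model_rx_complete are hypothetical stand-ins invented for illustration. It shows that one schedule followed by two completions (or a completion after someone else cleared the flag) is what trips the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the __LINK_STATE_RX_SCHED bit in dev->state. */
#define MODEL_RX_SCHED 0x1u

/* Hypothetical, heavily simplified stand-in for struct net_device. */
struct model_dev {
	unsigned int state;
	bool on_poll_list;
};

/* Models netif_rx_schedule(): set RX_SCHED and queue the device once. */
static bool model_rx_schedule(struct model_dev *dev)
{
	if (dev->state & MODEL_RX_SCHED)
		return false;		/* already scheduled, nothing to do */
	dev->state |= MODEL_RX_SCHED;
	dev->on_poll_list = true;
	return true;
}

/*
 * Models __netif_rx_complete(). Returns false in the case where the
 * kernel would hit BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state)).
 */
static bool model_rx_complete(struct model_dev *dev)
{
	if (!(dev->state & MODEL_RX_SCHED))
		return false;		/* the real code would BUG() here */
	dev->on_poll_list = false;	/* list_del(&dev->poll_list) */
	dev->state &= ~MODEL_RX_SCHED;	/* clear_bit(__LINK_STATE_RX_SCHED) */
	return true;
}

/* One schedule, two completions: the second one is the would-be BUG. */
static void model_double_complete_demo(void)
{
	struct model_dev dev = { 0, false };
	bool scheduled = model_rx_schedule(&dev);
	bool first = model_rx_complete(&dev);	/* first poller finishes */
	bool second = model_rx_complete(&dev);	/* second poller finishes */

	assert(scheduled);
	assert(first);
	assert(!second);	/* RX_SCHED was already clear */
}
```

Calling model_double_complete_demo() from a small main() exercises the sequence. The model says nothing about which code path performs the second completion in the real crash; that is exactly the open question in points 1) and 2).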
Comment 12 Tina Yang 2007-10-16 20:27:49 UTC
bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9124
> 
> 
> 
> 
> 
> ------- Comment #11 from nhorman@tuxdriver.com  2007-10-08 06:20 -------
> hmm, do you have the rest of this crash?  And can you post the patch that you
> used to test with?  The fact that you got a BUG in netdevice.h:1008 suggests
> that the __LINK_STATE_RX_SCHED flag was cleared while the
> __LINK_STATE_NETPOLL_SERVICED bit was never set.  This in my mind suggests
> that:
> 1) Netpoll isn't involved in this crash (which makes the above patch moot)
> 2) Someone is clearing the RX_SCHED flag without holding the poll_lock, which
> would be bad.
> 
> 

The above panic was from a vanilla 2.6.23-rc8.  The top of the stack dump had
scrolled out of the console window; this is the part that remained:
 [<f90c583f>] xs_tcp_send_request+0x5a/0x11c [sunrpc]
 [<f90c47a1>] xprt_transmit+0xca/0x1b8 [sunrpc]
 [<f8eb3134>] nfs3_xdr_writeargs+0x0/0x7f [nfs]
 [<f90c848a>] rpcauth_wrap_req+0x7a/0x84 [sunrpc]
 [<f90c7d54>] rpcauth_marshcred+0x48/0x4f [sunrpc]
 [<f8eb3134>] nfs3_xdr_writeargs+0x0/0x7f [nfs]
 [<f90c248a>] call_transmit+0x1c0/0x1f3 [sunrpc]
 [<f90c21b9>] call_reserve+0x3c/0x65 [sunrpc]
 [<f90c7490>] __rpc_execute+0x6f/0x1fe [sunrpc]
 [<c061c051>] lock_kernel+0x14/0x2e
 [<f90c761f>] rpc_async_schedule+0x0/0x8 [sunrpc]
 [<c0435234>] run_workqueue+0x74/0xf7
 [<c0435a28>] worker_thread+0x0/0xc4
 [<c0435ae2>] worker_thread+0xba/0xc4
 [<c04380c9>] autoremove_wake_function+0x0/0x35
 [<c0438001>] kthread+0x38/0x5f
 [<c0437fc9>] kthread+0x0/0x5f
 [<c04059f3>] kernel_thread_helper+0x7/0x10

I tried your patch on 2.6.23-rc8 (below) as well as a vanilla 2.6.23.1.
Both hung the system as soon as I started the test, but this time no messages
were shown on the console.



diff -uprN -X linux-2.6.23-rc8/Documentation/dontdiff linux-2.6.23-rc8/include/linux/netdevice.h linux-2.6.23-rc8-rh/include/linux/netdevice.h
--- linux-2.6.23-rc8/include/linux/netdevice.h	2007-10-09 13:47:04.000000000 -0700
+++ linux-2.6.23-rc8-rh/include/linux/netdevice.h	2007-10-12 13:53:27.000000000 -0700
@@ -262,6 +262,7 @@ enum netdev_state_t
 	__LINK_STATE_LINKWATCH_PENDING,
 	__LINK_STATE_DORMANT,
 	__LINK_STATE_QDISC_RUNNING,
+	__LINK_STATE_NETPOLL_SERVICED,
 };


@@ -1005,6 +1006,8 @@ static inline int netif_rx_reschedule(st
  */
 static inline void __netif_rx_complete(struct net_device *dev)
 {
+	if (test_bit(__LINK_STATE_NETPOLL_SERVICED, &dev->state))
+		return;
 	BUG_ON(!test_bit(__LINK_STATE_RX_SCHED, &dev->state));
 	list_del(&dev->poll_list);
 	smp_mb__before_clear_bit();
@@ -1020,6 +1023,8 @@ static inline void netif_rx_complete(str
 {
 	unsigned long flags;

+	if (test_bit(__LINK_STATE_NETPOLL_SERVICED, &dev->state))
+		return;
 	local_irq_save(flags);
 	__netif_rx_complete(dev);
 	local_irq_restore(flags);
diff -uprN -X linux-2.6.23-rc8/Documentation/dontdiff linux-2.6.23-rc8/net/core/dev.c linux-2.6.23-rc8-rh/net/core/dev.c
--- linux-2.6.23-rc8/net/core/dev.c	2007-10-09 13:47:06.000000000 -0700
+++ linux-2.6.23-rc8-rh/net/core/dev.c	2007-10-12 13:56:00.000000000 -0700
@@ -2085,6 +2085,27 @@ static void net_rx_action(struct softirq
 				 struct net_device, poll_list);
 		have = netpoll_poll_lock(dev);

+		if (test_and_clear_bit(__LINK_STATE_NETPOLL_SERVICED, &dev->state)) {
+			/*
+			 * Seeing this bit set indicates that the netpoll code
+			 * has handled this devices poll routine despite the fact
+			 * that we have it on this cpu's poll_list.  Lets do
+			 * some fixup here to ensure that we don't have any
+			 * inconsistent state.
+			 */
+			list_del(&dev->poll_list);
+			clear_bit(__LINK_STATE_RX_SCHED, &dev->state);
+			/*
+			 * Check to see if any pending frames have come in
+			 * while we were mucking about.  If they have lets reschedule
+			 * this device
+			 */
+			if (dev->quota > 0)
+				netif_rx_schedule(dev);
+			netpoll_poll_unlock(have);
+			continue;
+		}
+
 		if (dev->quota <= 0 || dev->poll(dev, &budget)) {
 			netpoll_poll_unlock(have);
 			local_irq_disable();
diff -uprN -X linux-2.6.23-rc8/Documentation/dontdiff linux-2.6.23-rc8/net/core/netpoll.c linux-2.6.23-rc8-rh/net/core/netpoll.c
--- linux-2.6.23-rc8/net/core/netpoll.c	2007-10-09 13:47:06.000000000 -0700
+++ linux-2.6.23-rc8-rh/net/core/netpoll.c	2007-10-12 14:11:16.000000000 -0700
@@ -121,9 +121,16 @@ static void poll_napi(struct netpoll *np
 	struct netpoll_info *npinfo = np->dev->npinfo;
 	int budget = 16;

-	if (test_bit(__LINK_STATE_RX_SCHED, &np->dev->state) &&
-	    npinfo->poll_owner != smp_processor_id() &&
+	if (npinfo->poll_owner != smp_processor_id() &&
 	    spin_trylock(&npinfo->poll_lock)) {
+	    /* net_rx_action already got to us */
+	        if (!test_bit(__LINK_STATE_RX_SCHED, &np->dev->state))
+			goto skip;
+		/*
+		 * This gets cleared in net_rx_action
+		 */
+		set_bit(__LINK_STATE_NETPOLL_SERVICED, &np->dev->state);
+
 		npinfo->rx_flags |= NETPOLL_RX_DROP;
 		atomic_inc(&trapped);

@@ -131,6 +138,7 @@ static void poll_napi(struct netpoll *np

 		atomic_dec(&trapped);
 		npinfo->rx_flags &= ~NETPOLL_RX_DROP;
+skip:
 		spin_unlock(&npinfo->poll_lock);
 	}
 }
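
For readers following the patch, the intended handshake can be sketched as a single-threaded user-space model. This is an illustration under stated assumptions, not the kernel implementation: the model_* names and MODEL_* constants are hypothetical, locking (poll_lock) and the dev->quota reschedule step are left out, and the model cannot say anything about why the patched kernel still hung in the test above. It only shows the state transitions the diff appears to aim for: netpoll marks the device as serviced, the completion helper becomes a no-op while that mark is set, and the softirq path later does the list and flag cleanup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two dev->state bits used by the patch. */
#define MODEL_RX_SCHED		0x1u
#define MODEL_NETPOLL_SERVICED	0x2u

/* Hypothetical, heavily simplified stand-in for struct net_device. */
struct model_dev {
	unsigned int state;
	bool on_poll_list;
};

enum model_result { MODEL_COMPLETED, MODEL_SKIPPED, MODEL_WOULD_BUG };

/* Models the patched __netif_rx_complete(). */
static enum model_result model_rx_complete(struct model_dev *dev)
{
	if (dev->state & MODEL_NETPOLL_SERVICED)
		return MODEL_SKIPPED;	/* early return added by the patch */
	if (!(dev->state & MODEL_RX_SCHED))
		return MODEL_WOULD_BUG;	/* the original BUG_ON condition */
	dev->on_poll_list = false;
	dev->state &= ~MODEL_RX_SCHED;
	return MODEL_COMPLETED;
}

/* Models the patched poll_napi(): returns true if netpoll ran the poll. */
static bool model_poll_napi(struct model_dev *dev)
{
	if (!(dev->state & MODEL_RX_SCHED))
		return false;		/* "net_rx_action already got to us" */
	dev->state |= MODEL_NETPOLL_SERVICED;
	/* dev->poll() would run here and end in the completion helper. */
	(void)model_rx_complete(dev);
	return true;
}

/* Models the fixup branch added to net_rx_action(). */
static bool model_net_rx_action_fixup(struct model_dev *dev)
{
	if (!(dev->state & MODEL_NETPOLL_SERVICED))
		return false;		/* normal path, no fixup needed */
	dev->state &= ~MODEL_NETPOLL_SERVICED;
	dev->on_poll_list = false;	/* list_del(&dev->poll_list) */
	dev->state &= ~MODEL_RX_SCHED;
	return true;
}

/* Walks one device through netpoll servicing followed by softirq fixup. */
static void model_handshake_demo(void)
{
	struct model_dev dev = { MODEL_RX_SCHED, true };
	bool serviced = model_poll_napi(&dev);
	bool still_queued = dev.on_poll_list;
	bool fixed_up = model_net_rx_action_fixup(&dev);

	assert(serviced);
	assert(still_queued);	/* netpoll did not touch the poll list */
	assert(fixed_up);
	assert(dev.state == 0 && !dev.on_poll_list);
}
```

In this model the device stays on the poll list with RX_SCHED set until the softirq path runs, so a later completion can no longer see RX_SCHED clear unexpectedly. Whether the same holds under real concurrency is not something a sequential sketch can establish.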
Comment 13 Asbjørn Sannes 2007-11-13 04:47:06 UTC
I work on the kernel a lot, and it seems that when a lot of output goes to netconsole at the same time as userspace traffic (such as ssh), I get a kernel panic. For me, running tar -jxvf somearchive triggers it, while tar -jxf somearchive does not. I am currently running 2.6.22.12 and get the following trace:

------------[ cut here ]------------
kernel BUG at include/linux/netdevice.h:918!
invalid opcode: 0000 [1] SMP
CPU 1
Modules linked in:
Pid: 4971, comm: sshd Not tainted 2.6.22.12 #1
RIP: 0010:[<ffffffff803d52a1>]  [<ffffffff803d52a1>] tg3_poll+0x7fd/0x8e4
RSP: 0018:ffff8100010ffea8  EFLAGS: 00010046
RAX: 0000000000000006 RBX: 0000000000000202 RCX: 0000000000000000
RDX: ffff810004728000 RSI: ffff8100010fff4c RDI: ffff810004728000
RBP: ffff8100010fff28 R08: 0000000000000002 R09: 0000000000000000
R10: ffffffff804990c5 R11: ffff81000499bbb8 R12: ffff810004728980
R13: 0000000000000001 R14: ffff810001016b80 R15: 000000010001c8ac
FS:  00002ae93793bef0(0000) GS:ffff8100010eedc0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000584000 CR3: 0000000004965000 CR4: 00000000000006e0
Process sshd (pid: 4971, threadinfo ffff81000499a000, task ffff81000421c140)
Stack:  ffff810001356820 ffff8100010fff4c ffff810004728000 ffff810001283000
 0000000000000000 ffff810001016bd0 ffff8100010fff08 ffffffff8033c048
 ffff810001356808 0000000000000000 ffff810001016bd0 ffff810004728000
Call Trace:
 <IRQ>  [<ffffffff8033c048>] _raw_spin_lock+0x90/0xf9
 [<ffffffff804990f4>] net_rx_action+0xb0/0x188
 [<ffffffff80237e83>] __do_softirq+0x4b/0xe3
 [<ffffffff80237e97>] __do_softirq+0x5f/0xe3
 [<ffffffff8020a6bc>] call_softirq+0x1c/0x28
 <EOI>  [<ffffffff8020ba34>] do_softirq+0x39/0x9f
 [<ffffffff80237c0b>] local_bh_enable+0x102/0x12b
 [<ffffffff80499412>] dev_queue_xmit+0x246/0x274
 [<ffffffff804e0aee>] ip6_output2+0x2ef/0x363
 [<ffffffff804e191c>] ip6_output+0xc4d/0xc71
 [<ffffffff80291e1c>] free_poll_entry+0x21/0x25
 [<ffffffff80291e4a>] poll_freewait+0x2a/0x6b
 [<ffffffff804e23f3>] ip6_xmit+0x343/0x421
 [<ffffffff80502257>] inet6_csk_xmit+0x1e4/0x1fd
 [<ffffffff804c08b0>] tcp_transmit_skb+0x6a9/0x6ea
 [<ffffffff8024cbfd>] trace_hardirqs_on+0x114/0x138
 [<ffffffff804c2677>] __tcp_push_pending_frames+0x7ee/0x8ba
 [<ffffffff8027f9fd>] __kmalloc_node+0x25/0x2a
 [<ffffffff804928b9>] __alloc_skb+0x84/0x13b
 [<ffffffff804b798c>] tcp_sendmsg+0x9ca/0xadd
 [<ffffffff8048c0f0>] sock_aio_write+0xde/0xf2
 [<ffffffff80285747>] do_sync_write+0xe2/0x126
 [<ffffffff80245b8c>] autoremove_wake_function+0x0/0x38
 [<ffffffff8051ccf0>] _spin_unlock_irq+0x2b/0x31
 [<ffffffff8024cbfd>] trace_hardirqs_on+0x114/0x138
 [<ffffffff80285f77>] vfs_write+0xe2/0x158
 [<ffffffff802864ed>] sys_write+0x47/0x70
 [<ffffffff802094be>] system_call+0x7e/0x83


Code: 0f 0b eb fe 48 8b 4d 90 48 8b 45 90 48 81 c1 00 02 00 00 48
RIP  [<ffffffff803d52a1>] tg3_poll+0x7fd/0x8e4
 RSP <ffff8100010ffea8>
Kernel panic - not syncing: Aiee, killing interrupt handler!


I am not sure whether this is the same issue; if it is not, I apologize.
Comment 14 Alan 2008-09-26 05:27:37 UTC
Closing out old stale bug (please reopen if seen with recent kernels)
