Bug 12957 - Kernel Oops on R/W from/to dspcfg_workaround of natsemi NIC parameter
Summary: Kernel Oops on R/W from/to dspcfg_workaround of natsemi NIC parameter
Status: CLOSED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 low
Assignee: drivers_network@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-03-28 02:15 UTC by Nick
Modified: 2012-05-30 15:00 UTC

See Also:
Kernel Version: 2.6.24 (at least >=)
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Nick 2009-03-28 02:15:24 UTC
Kernel oopses on reading dspcfg_workaround of a natsemi network card.

# cat /sys/devices/pci0000:00/0000:00:1e.0/0000:06:0d.0/dspcfg_workaround
Segmentation fault

# /bin/echo "on" > /sys/devices/pci0000:00/0000:00:1e.0/0000:06:0d.0/dspcfg_workaround
Segmentation fault


on-reading dmesg:
--------------CUT-------------
[213929.279480] ---[ end trace fb46f6931925b2eb ]---
[214792.977099] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000380
[214792.977212] printing eip: c03c8fde *pdpt = 00000000373de001 *pde = 0000000000000000
[214792.977413] Oops: 0000 [#9] SMP
[214792.977549] Modules linked in: sr_mod cdrom
[214792.977729]
[214792.977784] Pid: 21113, comm: cat Tainted: G      D (2.6.24.7-40 #1)
[214792.977849] EIP: 0060:[<c03c8fde>] EFLAGS: 00010292 CPU: 1
[214792.977916] EIP is at natsemi_show_dspcfg_workaround+0xe/0x40
[214792.977975] EAX: c07fc1d6 EBX: c03c8fd0 ECX: f6871000 EDX: 00000000
[214792.978038] ESI: f7d5acc4 EDI: f7ffdd20 EBP: ef545a14 ESP: f7373f38
[214792.978099]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[214792.978161] Process cat (pid: 21113, ti=f7372000 task=f08a85e0 task.ti=f7372000)
[214792.978222] Stack: 000080d0 fffffff4 f7ffdd20 c034a0bd ef545a00 c08ccb74 c01b0497 ed74b580
[214792.978661]        f7fbfe00 00001000 0804d038 c08ccb74 f7d5acc4 f3364500 0804d038 c01b0430
[214792.979098]        00001000 c016e96d f7373fa0 00000006 f3364500 fffffff7 00001000 f7372000
[214792.979536] Call Trace:
[214792.979640]  [<c034a0bd>] dev_attr_show+0x1d/0x30
[214792.979746]  [<c01b0497>] sysfs_read_file+0x67/0xe0
[214792.979850]  [<c01b0430>] sysfs_read_file+0x0/0xe0
[214792.979956]  [<c016e96d>] vfs_read+0x9d/0x160
[214792.980063]  [<c016eec1>] sys_read+0x41/0x70
[214792.980165]  [<c0102b86>] sysenter_past_esp+0x5f/0x85
[214792.980268]  =======================
[214792.980323] Code: 5f 5d e9 f6 7e 2a 00 83 c4 10 5b 5e 5f 5d c3 8d b4 26 00 00 00 00 8d bc 27 00 00 00 00 83 ec 0c 8b 90 14 ff ff ff b8 d6 c1 7f c0 <8b> 92 80 03 00 00 c7 44 24 04 16 99 81 c0 89 0c 24 85 d2 ba 8e
[214792.983042] EIP: [<c03c8fde>] natsemi_show_dspcfg_workaround+0xe/0x40 SS:ESP 0068:f7373f38
[214792.983213] ---[ end trace fb46f6931925b2eb ]---
--------------/CUT-------------

on-writing dmesg:
--------------CUT-------------
[213453.081746] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000390
[213453.081864] printing eip: c0712983 *pdpt = 0000000029f0e001 *pde = 0000000000000000
[213453.082053] Oops: 0002 [#7] SMP
[213453.082181] Modules linked in: sr_mod cdrom
[213453.082351]
[213453.082402] Pid: 17467, comm: echo Tainted: G      D (2.6.24.7-40 #1)
[213453.082459] EIP: 0060:[<c0712983>] EFLAGS: 00010046 CPU: 3
[213453.082520] EIP is at _spin_lock_irqsave+0x3/0x30
[213453.082574] EAX: 00000390 EBX: 00000390 ECX: ffffffff EDX: 00000246
[213453.082629] ESI: 00000001 EDI: 00000000 EBP: 00000003 ESP: e9c0ff30
[213453.082684]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[213453.082741] Process echo (pid: 17467, ti=e9c0e000 task=f0a411a0 task.ti=e9c0e000)
[213453.082797] Stack: c03c9067 c03c9010 00000003 f7d5acc4 00000003 c034a0f5 00000003 c08ccb74
[213453.083213]        f7ffdd20 c01b03e0 00000003 b7f4e000 f28b6c00 f6b25e94 c08ccb74 f28b6c00
[213453.083624]        b7f4e000 c01b0320 00000003 c016e80f e9c0ffa0 00000014 f28b6c00 fffffff7
[213453.084039] Call Trace:
[213453.084136]  [<c03c9067>] natsemi_set_dspcfg_workaround+0x57/0xb0
[213453.084237]  [<c03c9010>] natsemi_set_dspcfg_workaround+0x0/0xb0
[213453.084332]  [<c034a0f5>] dev_attr_store+0x25/0x40
[213453.084429]  [<c01b03e0>] sysfs_write_file+0xc0/0x110
[213453.084527]  [<c01b0320>] sysfs_write_file+0x0/0x110
[213453.084621]  [<c016e80f>] vfs_write+0x9f/0x160
[213453.084717]  [<c016ef31>] sys_write+0x41/0x70
[213453.084810]  [<c0102b86>] sysenter_past_esp+0x5f/0x85
[213453.084906]  [<c0710000>] gss_wrap_kerberos+0x10/0x270
[213453.085002]  =======================
[213453.085054] Code: 04 24 e9 51 02 a1 ff 90 f0 ff 00 8b 04 24 e9 45 02 a1 ff 90 8d 74 26 00 c6 00 01 8b 04 24 e9 35 02 a1 ff 90 8d 74 26 00 9c 5a fa <f0> fe 08 79 1c f7 c2 00 02 00 00 74 0b fb f3 90 80 38 00 7e f9
[213453.087671] EIP: [<c0712983>] _spin_lock_irqsave+0x3/0x30 SS:ESP 0068:e9c0ff30
[213453.087820] ---[ end trace fb46f6931925b2eb ]---
--------------/CUT-------------
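The two fault addresses above (00000380 on read, 00000390 on write) are small offsets from zero, which is what a member access through a NULL structure pointer looks like. The following userspace sketch only illustrates that arithmetic. The struct layout, the names fake_netdev_private and member_address, and the choice of offsets are assumptions made up to match the numbers in the traces; they are not the real natsemi layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a driver-private structure. The padding is
 * chosen so that two members land at the offsets seen in the oopses
 * (0x380 and 0x390). This is NOT the real struct netdev_private.
 */
struct fake_netdev_private {
    char earlier_fields[0x380];                     /* whatever precedes the flag */
    int dspcfg_workaround;                          /* assumed to sit at 0x380 */
    char more_fields[0x390 - 0x380 - sizeof(int)];  /* filler up to 0x390 */
    int lock;                                       /* assumed to sit at 0x390 */
};

/* Address the CPU touches when a member at 'offset' is accessed via 'base'. */
uintptr_t member_address(uintptr_t base, size_t offset)
{
    return base + offset;
}

/* Confirms the made-up layout really produces the offsets from the traces. */
void check_layout(void)
{
    assert(offsetof(struct fake_netdev_private, dspcfg_workaround) == 0x380);
    assert(offsetof(struct fake_netdev_private, lock) == 0x390);
}
```

With a base of 0, member_address() yields 0x380 for the flag and 0x390 for the lock, which would be consistent with the read faulting in the show handler and the write faulting inside _spin_lock_irqsave.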


This is the Ubuntu 8.04 kernel, branch 2.6.24.x.
The same code is unchanged in 2.6.27.x.

The problem should be simple. Here it is:
--------------CUT------------
static ssize_t natsemi_show_dspcfg_workaround(struct device *dev,
                                              struct device_attribute *attr,
                                              char *buf)
{
        struct netdev_private *np = netdev_priv(to_net_dev(dev));

        return sprintf(buf, "%s\n", np->dspcfg_workaround ? "on" : "off");
}
-------------/CUT-------------

The expression "netdev_priv(to_net_dev(dev))" returns nonsense (NULL?) for some reason.
The same applies to natsemi_set_dspcfg_workaround().

It looks like "netdev_priv(to_net_dev(dev))" fails at some stage (to_net_dev()?), and the result of to_net_dev() is never checked, so the code goes on to use a bad pointer.
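One thing worth keeping in mind when looking for a missing check: a container_of-style conversion is plain pointer arithmetic, so it never yields NULL and cannot be validated by testing its result. The sketch below is a userspace illustration under stated assumptions. The struct definitions are simplified and hypothetical (not the real kernel layouts), and to_net_dev_addr is a helper invented here to compute the resulting address without dereferencing anything. Whether this is what actually happens in natsemi is an assumption, prompted only by the fact that the attribute file in this report lives under the PCI device's sysfs directory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's container_of(): subtract the member offset. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified structures; not the real kernel definitions. */
struct device {
    int id;
};

struct net_device {
    void *priv;          /* driver-private data pointer (illustrative only) */
    struct device dev;   /* device embedded inside the net_device */
};

#define to_net_dev(d) container_of(d, struct net_device, dev)

/*
 * Address that to_net_dev() would produce for 'd', computed as an integer
 * so the sketch never forms or dereferences an invalid pointer.
 */
uintptr_t to_net_dev_addr(const struct device *d)
{
    return (uintptr_t)d - offsetof(struct net_device, dev);
}

/* Returns 1 only when 'd' really is the dev member of 'nd'. */
int is_embedded_in(const struct device *d, const struct net_device *nd)
{
    return d == &nd->dev;
}
```

For a struct device that really is embedded in a struct net_device, to_net_dev() recovers the enclosing object. For any other struct device it still returns a non-NULL address, just one that points at unrelated memory, so a NULL check on the macro's result would not catch the problem.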

There are only 2 other places in the .24 kernel where such an unchecked conversion is used (drivers/net/gianfar_sysfs.c; drivers/infiniband/ulp/ipoib/ipoib_main.c); possibly there could be similar problems there too (just an assumption).

Thanks
Comment 1 Denis Kirjanov 2009-03-28 09:40:47 UTC
Try adding some debugging messages. You need to check the return value of the to_net_dev() macro.
Comment 2 Nick 2009-03-29 20:25:44 UTC
Unfortunately, I have no way to do this.
(No way, the server is in production :)

But I guess I have provided enough info to track the problem down.
There should be a place where the (struct device *dev)->(struct netdev) pointer is not initialized. It is possibly even somewhere in sysfs.
Comment 3 Andrew Morton 2009-04-07 23:49:08 UTC
Yes, arg `dev' is NULL.  Odd.

Can you retest something more recent, preferably 2.6.29?
