Bug 9674 (elv-modprobe-oops) - Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc, mmc_core
Summary: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc, mmc_core
Status: CLOSED CODE_FIX
Alias: elv-modprobe-oops
Product: Other
Classification: Unclassified
Component: Other (show other bugs)
Hardware: All Linux
: P1 low
Assignee: other_other
URL:
Keywords:
Depends on:
Blocks: 9243
  Show dependency tree
 
Reported: 2008-01-01 14:55 UTC by Nick
Modified: 2008-01-03 14:59 UTC (History)
2 users (show)

See Also:
Kernel Version: 2.6.24-rc6-git7
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Kernel config (43.42 KB, text/plain)
2008-01-01 14:59 UTC, Nick
Details

Description Nick 2008-01-01 14:55:43 UTC
Hardware Environment: x86_64
Software Environment: GNU
Problem Description: Oops during removing some MMC modules.
MMC controller was automatically detected and its module loaded (with several dependencies) on my laptop. I don't use it for now - so started to remove the modules. Here were the exact commands:
# rmmod sdhci
(sorry, don't have `lsmod` before all this. but 'sdhci' was just above 'sr_mod' module in the list)
# lsmod
Module                  Size  Used by
iwl3945               176904  0
tcp_westwood            3136  0
ipt_REJECT              4224  7
xt_state                2560  1
iptable_filter          3072  1
ipt_owner               2176  2
ipt_MASQUERADE          3520  1
xt_tcpudp               3648  17
xt_multiport            3584  4
iptable_nat             6340  1
nf_nat                 19628  2 ipt_MASQUERADE,iptable_nat
nf_conntrack_ipv4      16144  3 iptable_nat
nf_conntrack           66032  5 xt_state,ipt_MASQUERADE,iptable_nat,nf_nat,nf_conntrack_ipv4
ip_tables              19064  2 iptable_filter,iptable_nat
x_tables               16992  8 ipt_REJECT,xt_state,ipt_owner,ipt_MASQUERADE,xt_tcpudp,xt_multiport,iptable_nat,ip_tables
sr_mod                 16740  0
ricoh_mmc               3776  0
mmc_core               51424  0
cdrom                  37288  1 sr_mod
# rmmod sr_mod ricoh_mmc mmc_core

and here came the Oops.
Here is dmesg:
------------CUT------------
[   40.137967] ricoh-mmc: Controller is now re-enabled.
[   40.158455] Unable to handle kernel paging request at ffffffff8804cc80 RIP:
[   40.158460]  [<ffffffff8804cc80>]
[   40.158464] PGD 203067 PUD 207063 PMD 3dca9067 PTE 0
[   40.158468] Oops: 0010 [1] PREEMPT SMP
[   40.158472] CPU 1
[   40.158473] Modules linked in: sr_mod iwl3945 tcp_westwood ipt_REJECT xt_state iptable_filter ipt_owner ipt_MASQUERADE xt_tcpudp xt_multiport iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables cdrom
[   40.158488] Pid: 8008, comm: modprobe Not tainted 2.6.24-rc6-git7 #3
[   40.158490] RIP: 0010:[<ffffffff8804cc80>]  [<ffffffff8804cc80>]
[   40.158493] RSP: 0000:ffff81003df37a40  EFLAGS: 00010082
[   40.158494] RAX: ffffffff8804cc80 RBX: ffff81003f13a7e0 RCX: 0000000000000000
[   40.158496] RDX: 0000000000000000 RSI: ffff810029634360 RDI: ffff81003f13a7e0
[   40.158498] RBP: ffff810029634360 R08: 0000000000000000 R09: 0000000000000001
[   40.158500] R10: ffffffff8058bba0 R11: ffffffff803f7f40 R12: ffff81003df37a50
[   40.158502] R13: ffffffff00000010 R14: 0000000100000010 R15: ffff81003f023878
[   40.158505] FS:  00002b3efdc88b00(0000) GS:ffff81003f00c580(0000) knlGS:0000000000000000
[   40.158507] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   40.158509] CR2: ffffffff8804cc80 CR3: 000000003d3a4000 CR4: 00000000000006a0
[   40.158511] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   40.158513] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   40.158515] Process modprobe (pid: 8008, threadinfo ffff81003df36000, task ffff81003ddc2000)
[   40.158517] Stack:  ffffffff80331800 ffff81003f13a7e0 ffff810029634360 0000000000000001
[   40.158522]  ffff81003f023800 ffff81003f224000 ffff81003f13a7e0 0000000000000000
[   40.158526]  ffffffff803f7fa9 ffff810029634360 ffff81003f023b38 ffff81003f023968
[   40.158530] Call Trace:
[   40.158537]  [<ffffffff80331800>] elv_next_request+0xe0/0x170
[   40.158543]  [<ffffffff803f7fa9>] scsi_request_fn+0x69/0x360
[   40.158547]  [<ffffffff803352a2>] blk_execute_rq_nowait+0x62/0xb0
[   40.158551]  [<ffffffff80336d6a>] blk_execute_rq+0x8a/0x100
[   40.158553]  [<ffffffff803347e5>] get_request+0x155/0x2d0
[   40.158557]  [<ffffffff803f7d64>] scsi_execute+0xe4/0x120
[   40.158560]  [<ffffffff803f7e31>] scsi_execute_req+0x91/0x100
[   40.158565]  [<ffffffff8800b70b>] :sr_mod:sr_probe+0x21b/0x630
[   40.158570]  [<ffffffff802fa0a9>] sysfs_create_link+0xa9/0x150
[   40.158575]  [<ffffffff805620a5>] klist_next+0x35/0xe0
[   40.158579]  [<ffffffff803c34cc>] driver_probe_device+0x9c/0x1b0
[   40.158582]  [<ffffffff803c37b1>] __driver_attach+0xe1/0xf0
[   40.158585]  [<ffffffff803c36d0>] __driver_attach+0x0/0xf0
[   40.158589]  [<ffffffff803c26bd>] bus_for_each_dev+0x4d/0x80
[   40.158592]  [<ffffffff803c2acf>] bus_add_driver+0xaf/0x220
[   40.158596]  [<ffffffff8801102c>] :sr_mod:init_sr+0x2c/0x47
[   40.158601]  [<ffffffff80267856>] sys_init_module+0x196/0x1a20
[   40.158607]  [<ffffffff8020bdee>] system_call+0x7e/0x83
[   40.158609]
[   40.158610]
[   40.158611] Code:  Bad RIP value.
[   40.158615] RIP  [<ffffffff8804cc80>]
[   40.158617]  RSP <ffff81003df37a40>
[   40.158618] CR2: ffffffff8804cc80
[   40.158622] ---[ end trace 105b7f7d35157dd9 ]---
[   40.158624] note: modprobe[8008] exited with preempt_count 1
[   40.158627] BUG: sleeping function called from invalid context at kernel/rwsem.c:21
[   40.158629] in_atomic():1, irqs_disabled():0
[   40.158631] INFO: lockdep is turned off.
[   40.158632] Pid: 8008, comm: modprobe Tainted: G      D 2.6.24-rc6-git7 #3
[   40.158634]
[   40.158634] Call Trace:
[   40.158638]  [<ffffffff805655c5>] down_read+0x15/0x40
[   40.158641]  [<ffffffff8023c73b>] exit_mm+0x2b/0x110
[   40.158644]  [<ffffffff8023e1af>] do_exit+0x1af/0x900
[   40.158649]  [<ffffffff80226659>] do_page_fault+0x539/0x7c0
[   40.158653]  [<ffffffff80342d01>] __up_read+0x21/0xb0
[   40.158656]  [<ffffffff80342d01>] __up_read+0x21/0xb0
[   40.158660]  [<ffffffff802964c9>] handle_mm_fault+0x1c9/0x840
[   40.158663]  [<ffffffff8028b19c>] get_page_from_freelist+0x35c/0x570
[   40.158667]  [<ffffffff805670fd>] error_exit+0x0/0xa9
[   40.158670]  [<ffffffff803f7f40>] scsi_request_fn+0x0/0x360
[   40.158674]  [<ffffffff80331800>] elv_next_request+0xe0/0x170
[   40.158677]  [<ffffffff803f7fa9>] scsi_request_fn+0x69/0x360
[   40.158680]  [<ffffffff803352a2>] blk_execute_rq_nowait+0x62/0xb0
[   40.158683]  [<ffffffff80336d6a>] blk_execute_rq+0x8a/0x100
[   40.158686]  [<ffffffff803347e5>] get_request+0x155/0x2d0
[   40.158690]  [<ffffffff803f7d64>] scsi_execute+0xe4/0x120
[   40.158692]  [<ffffffff803f7e31>] scsi_execute_req+0x91/0x100
[   40.158697]  [<ffffffff8800b70b>] :sr_mod:sr_probe+0x21b/0x630
[   40.158700]  [<ffffffff802fa0a9>] sysfs_create_link+0xa9/0x150
[   40.158703]  [<ffffffff805620a5>] klist_next+0x35/0xe0
[   40.158706]  [<ffffffff803c34cc>] driver_probe_device+0x9c/0x1b0
[   40.158709]  [<ffffffff803c37b1>] __driver_attach+0xe1/0xf0
[   40.158711]  [<ffffffff803c36d0>] __driver_attach+0x0/0xf0
[   40.158714]  [<ffffffff803c26bd>] bus_for_each_dev+0x4d/0x80
[   40.158717]  [<ffffffff803c2acf>] bus_add_driver+0xaf/0x220
[   40.158721]  [<ffffffff8801102c>] :sr_mod:init_sr+0x2c/0x47
[   40.158724]  [<ffffffff80267856>] sys_init_module+0x196/0x1a20
[   40.158729]  [<ffffffff8020bdee>] system_call+0x7e/0x83
[   40.158732]
----------/CUT-------------

my .config is attached. Trying to reproduce now.
Comment 1 Nick 2008-01-01 14:59:28 UTC
Created attachment 14256 [details]
Kernel config
Comment 2 Nick 2008-01-01 15:00:28 UTC
BTW, `lsmod` after the Oops:

Module                  Size  Used by
sr_mod                 16811  1
tcp_westwood            3136  1
ipt_REJECT              4224  7
xt_state                2560  1
iptable_filter          3072  1
ipt_owner               2176  2
ipt_MASQUERADE          3520  1
xt_tcpudp               3648  17
xt_multiport            3584  4
iptable_nat             6340  1
nf_nat                 19628  2 ipt_MASQUERADE,iptable_nat
nf_conntrack_ipv4      16144  3 iptable_nat
nf_conntrack           66032  5 xt_state,ipt_MASQUERADE,iptable_nat,nf_nat,nf_conntrack_ipv4
ip_tables              19064  2 iptable_filter,iptable_nat
x_tables               16992  8 ipt_REJECT,xt_state,ipt_owner,ipt_MASQUERADE,xt_tcpudp,xt_multiport,iptable_nat,ip_tables
cdrom                  37288  1 sr_mod
Comment 3 Nick 2008-01-01 15:07:04 UTC
The problem is easily reproducable.
I'm just doing:
# rmmod sdhci
# rmmod sr_mod mmc_core ricoh_mmc

and getting the oops.
Posting the second dmesg. But it's looks like the first one:
[   99.312537] ricoh-mmc: Controller is now re-enabled.
[   99.385831] Unable to handle kernel paging request at ffffffff8804ac80 RIP:
[   99.385836]  [<ffffffff8804ac80>]
[   99.385840] PGD 203067 PUD 207063 PMD 3df7e067 PTE 0
[   99.385844] Oops: 0010 [1] PREEMPT SMP
[   99.385847] CPU 0
[   99.385849] Modules linked in: sr_mod tcp_westwood ipt_REJECT xt_state iptable_filter ipt_owner ipt_MASQUERADE xt_tcpudp xt_multiport iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables iwl3945 cdrom
[   99.385863] Pid: 4497, comm: modprobe Not tainted 2.6.24-rc6-git7 #3
[   99.385865] RIP: 0010:[<ffffffff8804ac80>]  [<ffffffff8804ac80>]
[   99.385868] RSP: 0018:ffff81003dcfba40  EFLAGS: 00010082
[   99.385870] RAX: ffffffff8804ac80 RBX: ffff81003f2b87e0 RCX: 0000000000000000
[   99.385872] RDX: 0000000000000000 RSI: ffff810030fb2b40 RDI: ffff81003f2b87e0
[   99.385874] RBP: ffff810030fb2b40 R08: 0000000000000000 R09: 0000000000000001
[   99.385876] R10: ffffffff8058bba0 R11: ffffffff803f7f40 R12: ffff81003dcfba50
[   99.385878] R13: ffffffff00000010 R14: 0000000100000010 R15: ffff81003f023878
[   99.385880] FS:  00002ae9802aeb00(0000) GS:ffffffff80729000(0000) knlGS:0000000000000000
[   99.385883] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   99.385884] CR2: ffffffff8804ac80 CR3: 0000000033b67000 CR4: 00000000000006e0
[   99.385886] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   99.385888] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   99.385891] Process modprobe (pid: 4497, threadinfo ffff81003dcfa000, task ffff81002d6ce000)
[   99.385892] Stack:  ffffffff80331800 ffff81003f2b87e0 ffff810030fb2b40 0000000000000001
[   99.385897]  ffff81003f023800 ffff81003f232000 ffff81003f2b87e0 0000000000000000
[   99.385902]  ffffffff803f7fa9 ffff810030fb2b40 ffff81003f023b38 ffff81003f023968
[   99.385905] Call Trace:
[   99.385912]  [<ffffffff80331800>] elv_next_request+0xe0/0x170
[   99.385917]  [<ffffffff803f7fa9>] scsi_request_fn+0x69/0x360
[   99.385921]  [<ffffffff803352a2>] blk_execute_rq_nowait+0x62/0xb0
[   99.385925]  [<ffffffff80336d6a>] blk_execute_rq+0x8a/0x100
[   99.385928]  [<ffffffff803347e5>] get_request+0x155/0x2d0
[   99.385932]  [<ffffffff803f7d64>] scsi_execute+0xe4/0x120
[   99.385934]  [<ffffffff803f7e31>] scsi_execute_req+0x91/0x100
[   99.385939]  [<ffffffff8800b70b>] :sr_mod:sr_probe+0x21b/0x630
[   99.385944]  [<ffffffff802fa0a9>] sysfs_create_link+0xa9/0x150
[   99.385948]  [<ffffffff805620a5>] klist_next+0x35/0xe0
[   99.385952]  [<ffffffff803c34cc>] driver_probe_device+0x9c/0x1b0
[   99.385955]  [<ffffffff803c37b1>] __driver_attach+0xe1/0xf0
[   99.385958]  [<ffffffff803c36d0>] __driver_attach+0x0/0xf0
[   99.385962]  [<ffffffff803c26bd>] bus_for_each_dev+0x4d/0x80
[   99.385965]  [<ffffffff803c2acf>] bus_add_driver+0xaf/0x220
[   99.385969]  [<ffffffff8801102c>] :sr_mod:init_sr+0x2c/0x47
[   99.385973]  [<ffffffff80267856>] sys_init_module+0x196/0x1a20
[   99.385979]  [<ffffffff8020bdee>] system_call+0x7e/0x83
[   99.385982]
[   99.385983]
[   99.385983] Code:  Bad RIP value.
[   99.385987] RIP  [<ffffffff8804ac80>]
[   99.385989]  RSP <ffff81003dcfba40>
[   99.385991] CR2: ffffffff8804ac80
[   99.385994] ---[ end trace 4db162cfda0579f9 ]---
[   99.385996] note: modprobe[4497] exited with preempt_count 1
[   99.385999] BUG: sleeping function called from invalid context at kernel/rwsem.c:21
[   99.386001] in_atomic():1, irqs_disabled():0
[   99.386002] INFO: lockdep is turned off.
[   99.386004] Pid: 4497, comm: modprobe Tainted: G      D 2.6.24-rc6-git7 #3
[   99.386006]
[   99.386006] Call Trace:
[   99.386010]  [<ffffffff805655c5>] down_read+0x15/0x40
[   99.386013]  [<ffffffff8023c73b>] exit_mm+0x2b/0x110
[   99.386015]  [<ffffffff8023e1af>] do_exit+0x1af/0x900
[   99.386020]  [<ffffffff80226659>] do_page_fault+0x539/0x7c0
[   99.386024]  [<ffffffff8028ea65>] activate_page+0x45/0x100
[   99.386028]  [<ffffffff8023283d>] __wake_up+0x2d/0x70
[   99.386031]  [<ffffffff8028b19c>] get_page_from_freelist+0x35c/0x570
[   99.386035]  [<ffffffff805670fd>] error_exit+0x0/0xa9
[   99.386037]  [<ffffffff803f7f40>] scsi_request_fn+0x0/0x360
[   99.386041]  [<ffffffff80331800>] elv_next_request+0xe0/0x170
[   99.386044]  [<ffffffff803f7fa9>] scsi_request_fn+0x69/0x360
[   99.386047]  [<ffffffff803352a2>] blk_execute_rq_nowait+0x62/0xb0
[   99.386050]  [<ffffffff80336d6a>] blk_execute_rq+0x8a/0x100
[   99.386052]  [<ffffffff803347e5>] get_request+0x155/0x2d0
[   99.386056]  [<ffffffff803f7d64>] scsi_execute+0xe4/0x120
[   99.386059]  [<ffffffff803f7e31>] scsi_execute_req+0x91/0x100
[   99.386063]  [<ffffffff8800b70b>] :sr_mod:sr_probe+0x21b/0x630
[   99.386067]  [<ffffffff802fa0a9>] sysfs_create_link+0xa9/0x150
[   99.386070]  [<ffffffff805620a5>] klist_next+0x35/0xe0
[   99.386073]  [<ffffffff803c34cc>] driver_probe_device+0x9c/0x1b0
[   99.386075]  [<ffffffff803c37b1>] __driver_attach+0xe1/0xf0
[   99.386078]  [<ffffffff803c36d0>] __driver_attach+0x0/0xf0
[   99.386081]  [<ffffffff803c26bd>] bus_for_each_dev+0x4d/0x80
[   99.386084]  [<ffffffff803c2acf>] bus_add_driver+0xaf/0x220
[   99.386087]  [<ffffffff8801102c>] :sr_mod:init_sr+0x2c/0x47
[   99.386090]  [<ffffffff80267856>] sys_init_module+0x196/0x1a20
[   99.386096]  [<ffffffff8020bdee>] system_call+0x7e/0x83
[   99.386098]
Comment 4 Nick 2008-01-01 15:18:22 UTC
The problem is reproducable (with the same dmesg) and with just:
# rmmod sdhci
# rmmod sr_mod mmc_core
(with only 2 first modules in the second cmd)

BUT just "rmmod sr_mod" doesn't causes the problem!
Even more. "sr_mod" is getting loaded immediately again on "rmmod sr_mod"!
I've tried that several times. Here is the dmesg:
[   97.749874] sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
[   97.749928] sr 1:0:0:0: Attached scsi CD-ROM sr0
[   97.997312] sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
[   97.997368] sr 1:0:0:0: Attached scsi CD-ROM sr0
[   98.650586] sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
[   98.650641] sr 1:0:0:0: Attached scsi CD-ROM sr0
     (just running "rmmod sdhci" here)
[  101.836267] ACPI: PCI interrupt for device 0000:02:01.1 disabled

[  100.133198] sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
[  100.133256] sr 1:0:0:0: Attached scsi CD-ROM sr0
[  100.205790] sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
[  100.205846] sr 1:0:0:0: Attached scsi CD-ROM sr0

So, "sr_mod" is getting loaded permanently after unloading.
But unloading altogether it with "mmc_core" causes the prob.
Comment 5 Nick 2008-01-01 15:25:17 UTC
The problem doesn't appear on removing only "mmc_core" module.
So, only removing altogether "sr_mod mmc_core" causes oops.
Comment 6 Nick 2008-01-01 15:36:37 UTC
The problem appears and if just physically remove sdhci.ko file _before_ reboot.
So, when system is booted sr_mod and mmc_sore are loaded though.
And the very first command causes the problem:
# rmmod sr_mod mmc_core

NOTE!!
The problem does NOT appear in 2.6.24-rc4 (!!). The modules are removed ok there.
So, the bug is fresh.
Comment 7 Anonymous Emailer 2008-01-01 18:11:30 UTC
Reply-To: akpm@linux-foundation.org

On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9674
> 
>            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc,
>                     mmc_core

Guys, this is a very recent regression.  Could you please take a look, see
if it's due to mmc, block or scsi changes?

Thanks.
Comment 8 Anonymous Emailer 2008-01-01 19:25:12 UTC
Reply-To: James.Bottomley@HansenPartnership.com


On Tue, 2008-01-01 at 18:10 -0800, Andrew Morton wrote:
> On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org
> wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=9674
> > 
> >            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod, ricoh_mmc,
> >                     mmc_core
> 
> Guys, this is a very recent regression.  Could you please take a look, see
> if it's due to mmc, block or scsi changes?

There's not a lot of information to go on.  The stack trace looks bogus,
so I guess the kernel is compiled without a frame pointer.  However, it
does look like the initial insertion of sr_mod is going through and it
generates a command which gets into scsi_request_fn and then indirects
through a bogus queueucommand pointer.

What's the actual underlying device the cdrom is attached to?

There's no real changes to SCSI in this area from 2.6.24-rc4 ...
however, the reinsertion is suggestive, it's like the removal is
retriggering a module request for some reason.

James
Comment 9 Pierre Ossman 2008-01-01 22:54:06 UTC
FWIW, mmc_core has zero commits since -rc4, so if it's a bug there it is something that has been dormant for some time.
Comment 10 Nick 2008-01-01 23:16:07 UTC
here is the dump from the same kernel recompiled with frame pointer:

[  101.359083] Unable to handle kernel paging request at ffffffff88021cc0 RIP:
[  101.359092]  [<ffffffff88021cc0>]
[  101.359099] PGD 203067 PUD 207063 PMD 3d34a067 PTE 0
[  101.359108] Oops: 0010 [1] PREEMPT SMP
[  101.359115] CPU 0
[  101.359118] Modules linked in: sr_mod tcp_westwood ipt_REJECT xt_state iptable_filter ipt_owner ipt_MASQUERADE xt_tcpudp xt_multiport iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables iwl3945 ricoh_mmc cdrom
[  101.359150] Pid: 4496, comm: modprobe Not tainted 2.6.24-rc6-git7 #5
[  101.359154] RIP: 0010:[<ffffffff88021cc0>]  [<ffffffff88021cc0>]
[  101.359159] RSP: 0018:ffff81002b457970  EFLAGS: 00010086
[  101.359163] RAX: ffffffff88021cc0 RBX: ffff81003f1627e0 RCX: ffff81003f023b38
[  101.359167] RDX: 0000000000000000 RSI: ffff810030efd000 RDI: ffff81003f1627e0
[  101.359171] RBP: ffff81002b4579b8 R08: 0000000000000001 R09: 0000000000000001
[  101.359175] R10: 0000000000000000 R11: 0000000000000000 R12: ffff810030efd000
[  101.359179] R13: ffff81002b457988 R14: ffffffff00000010 R15: 0000000100000010
[  101.359185] FS:  00002adcf7ea0b00(0000) GS:ffffffff80733000(0000) knlGS:0000000000000000
[  101.359189] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  101.359193] CR2: ffffffff88021cc0 CR3: 000000002b497000 CR4: 00000000000006e0
[  101.359197] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  101.359201] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  101.359206] Process modprobe (pid: 4496, threadinfo ffff81002b456000, task ffff81003de1ef50)
[  101.359210] Stack:  ffffffff80333a98 0000000000000086 ffff81003f023af8 ffff810030efd000
[  101.359221]  ffff81003f1627e0 ffff81003f023800 ffff81003e31c000 ffff81003f1627e0
[  101.359230]  0000000000000000 ffff81002b457a08 ffffffff803fe7e1 ffff81003f023b38
[  101.359237] Call Trace:
[  101.359248]  [<ffffffff80333a98>] elv_next_request+0xe8/0x180
[  101.359256]  [<ffffffff803fe7e1>] scsi_request_fn+0x71/0x380
[  101.359264]  [<ffffffff803375b8>] __generic_unplug_device+0x28/0x30
[  101.359270]  [<ffffffff80337623>] blk_execute_rq_nowait+0x63/0xb0
[  101.359276]  [<ffffffff80339113>] blk_execute_rq+0x73/0xe0
[  101.359283]  [<ffffffff80337775>] get_request_wait+0x25/0x120
[  101.359288]  [<ffffffff80337896>] blk_get_request+0x26/0x80
[  101.359296]  [<ffffffff803fe5b2>] scsi_execute+0xe2/0x110
[  101.359301]  [<ffffffff803fe661>] scsi_execute_req+0x81/0xf0
[  101.359312]  [<ffffffff8800d713>] :sr_mod:sr_probe+0x1e3/0x630
[  101.359323]  [<ffffffff803c8d01>] driver_probe_device+0xa1/0x1c0
[  101.359329]  [<ffffffff803c8ff5>] __driver_attach+0xe5/0xf0
[  101.359334]  [<ffffffff803c8f10>] __driver_attach+0x0/0xf0
[  101.359342]  [<ffffffff803c7ee3>] bus_for_each_dev+0x53/0x80
[  101.359348]  [<ffffffff803c8b3c>] driver_attach+0x1c/0x20
[  101.359353]  [<ffffffff803c8305>] bus_add_driver+0xa5/0x210
[  101.359360]  [<ffffffff803c922a>] driver_register+0x4a/0x80
[  101.359367]  [<ffffffff80402241>] scsi_register_driver+0x11/0x20
[  101.359374]  [<ffffffff8801303c>] :sr_mod:init_sr+0x3c/0x5c
[  101.359382]  [<ffffffff80268a23>] sys_init_module+0x153/0x1a80
[  101.359395]  [<ffffffff8020bdee>] system_call+0x7e/0x83
[  101.359399]
[  101.359401]
[  101.359402] Code:  Bad RIP value.
[  101.359409] RIP  [<ffffffff88021cc0>]
[  101.359414]  RSP <ffff81002b457970>
[  101.359416] CR2: ffffffff88021cc0
[  101.359425] ---[ end trace c303fca3a91ba9a8 ]---
[  101.359429] note: modprobe[4496] exited with preempt_count 1
[  101.359438] BUG: sleeping function called from invalid context at kernel/rwsem.c:21
[  101.359451] in_atomic():1, irqs_disabled():0
[  101.359460] INFO: lockdep is turned off.
[  101.359469] Pid: 4496, comm: modprobe Tainted: G      D 2.6.24-rc6-git7 #5
[  101.359480]
[  101.359482] Call Trace:
[  101.359494]  [<ffffffff80232ee2>] __might_sleep+0xc2/0xf0
[  101.359509]  [<ffffffff8056f56d>] down_read+0x1d/0x50
[  101.359521]  [<ffffffff8023cb80>] exit_mm+0x30/0x110
[  101.359532]  [<ffffffff8023e6d7>] do_exit+0x1c7/0x950
[  101.359546]  [<ffffffff802268b8>] do_page_fault+0x5a8/0x860
[  101.359564]  [<ffffffff805712ad>] error_exit+0x0/0xa9
[  101.359578]  [<ffffffff80333a98>] elv_next_request+0xe8/0x180
[  101.359591]  [<ffffffff803fe7e1>] scsi_request_fn+0x71/0x380
[  101.359606]  [<ffffffff803375b8>] __generic_unplug_device+0x28/0x30
[  101.359619]  [<ffffffff80337623>] blk_execute_rq_nowait+0x63/0xb0
[  101.359633]  [<ffffffff80339113>] blk_execute_rq+0x73/0xe0
[  101.359647]  [<ffffffff80337775>] get_request_wait+0x25/0x120
[  101.359660]  [<ffffffff80337896>] blk_get_request+0x26/0x80
[  101.359674]  [<ffffffff803fe5b2>] scsi_execute+0xe2/0x110
[  101.359688]  [<ffffffff803fe661>] scsi_execute_req+0x81/0xf0
[  101.359703]  [<ffffffff8800d713>] :sr_mod:sr_probe+0x1e3/0x630
[  101.359719]  [<ffffffff803c8d01>] driver_probe_device+0xa1/0x1c0
[  101.359731]  [<ffffffff803c8ff5>] __driver_attach+0xe5/0xf0
[  101.359742]  [<ffffffff803c8f10>] __driver_attach+0x0/0xf0
[  101.359756]  [<ffffffff803c7ee3>] bus_for_each_dev+0x53/0x80
[  101.359768]  [<ffffffff803c8b3c>] driver_attach+0x1c/0x20
[  101.359780]  [<ffffffff803c8305>] bus_add_driver+0xa5/0x210
[  101.359793]  [<ffffffff803c922a>] driver_register+0x4a/0x80
[  101.359807]  [<ffffffff80402241>] scsi_register_driver+0x11/0x20
[  101.359821]  [<ffffffff8801303c>] :sr_mod:init_sr+0x3c/0x5c
[  101.359834]  [<ffffffff80268a23>] sys_init_module+0x153/0x1a80
[  101.359853]  [<ffffffff8020bdee>] system_call+0x7e/0x83
[  101.359865]
Comment 11 Rafael J. Wysocki 2008-01-02 04:20:04 UTC
On Wednesday, 2 of January 2008, James Bottomley wrote:
> 
> On Tue, 2008-01-01 at 18:10 -0800, Andrew Morton wrote:
> > On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org
> wrote:
> > 
> > > http://bugzilla.kernel.org/show_bug.cgi?id=9674
> > > 
> > >            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod,
> ricoh_mmc,
> > >                     mmc_core
> > 
> > Guys, this is a very recent regression.  Could you please take a look, see
> > if it's due to mmc, block or scsi changes?
> 
> There's not a lot of information to go on.  The stack trace looks bogus,
> so I guess the kernel is compiled without a frame pointer.

The bug report has been updated with a stack trace from a kernel compiled
with a frame pointer.  Please have a look.
Comment 12 Anonymous Emailer 2008-01-02 07:41:45 UTC
Reply-To: James.Bottomley@HansenPartnership.com


On Wed, 2008-01-02 at 13:21 +0100, Rafael J. Wysocki wrote:
> On Wednesday, 2 of January 2008, James Bottomley wrote:
> > 
> > On Tue, 2008-01-01 at 18:10 -0800, Andrew Morton wrote:
> > > On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org
> wrote:
> > > 
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=9674
> > > > 
> > > >            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod,
> ricoh_mmc,
> > > >                     mmc_core
> > > 
> > > Guys, this is a very recent regression.  Could you please take a look,
> see
> > > if it's due to mmc, block or scsi changes?
> > 
> > There's not a lot of information to go on.  The stack trace looks bogus,
> > so I guess the kernel is compiled without a frame pointer.
> 
> The bug report has been updated with a stack trace from a kernel compiled
> with a frame pointer.  Please have a look.

Please, please don't do this.  Filing something in bugzilla is
tantamount to putting it in the file and forget folder.  The reason I
cc'd the SCSI mailing list and asked for more details is so that we get
the email flow that might trigger direct interaction between the
reporter and someone on the list who recognised the symptoms.  

Let me say again, catagorically, that if you want to give a bug the best
chance of being fixed, the correct flow of information is:

file a bugzilla and note the bugid.
Then email a complete report to the relevant list, but add [BUG <bugid>]
to the subject line and cc bugme-daemon@bugzilla.kernel.org  If you do
this, bugzilla will keep track of the entire discussion as it progresses
and allow those who track bugs through bugzilla to get a pretty accurate
idea of the status.  You should never need to touch bugzilla again once
the initial bug report is filed: all future information flow is via the
mailing lists.

Also, using urls unless for historical purposes is also a killer.  Many
people travel, and their MO is to download the email and read it on the
plane/train/whatever.  If you embed a url containing critical
information, the email gets marked as read, but since I can't get to the
information, nothing happens.  Then it gets forgotten.

This is the relevant piece of information that should have been on the
mailing list:


[  101.359083] Unable to handle kernel paging request at ffffffff88021cc0 RIP:
[  101.359092]  [<ffffffff88021cc0>]
[  101.359099] PGD 203067 PUD 207063 PMD 3d34a067 PTE 0
[  101.359108] Oops: 0010 [1] PREEMPT SMP
[  101.359115] CPU 0
[  101.359118] Modules linked in: sr_mod tcp_westwood ipt_REJECT xt_state
iptable_filter ipt_owner ipt_MASQUERADE xt_tcpudp xt_multiport iptable_nat
nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables iwl3945 ricoh_mmc
cdrom
[  101.359150] Pid: 4496, comm: modprobe Not tainted 2.6.24-rc6-git7 #5
[  101.359154] RIP: 0010:[<ffffffff88021cc0>]  [<ffffffff88021cc0>]
[  101.359159] RSP: 0018:ffff81002b457970  EFLAGS: 00010086
[  101.359163] RAX: ffffffff88021cc0 RBX: ffff81003f1627e0 RCX:
ffff81003f023b38
[  101.359167] RDX: 0000000000000000 RSI: ffff810030efd000 RDI:
ffff81003f1627e0
[  101.359171] RBP: ffff81002b4579b8 R08: 0000000000000001 R09:
0000000000000001
[  101.359175] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff810030efd000
[  101.359179] R13: ffff81002b457988 R14: ffffffff00000010 R15:
0000000100000010
[  101.359185] FS:  00002adcf7ea0b00(0000) GS:ffffffff80733000(0000)
knlGS:0000000000000000
[  101.359189] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  101.359193] CR2: ffffffff88021cc0 CR3: 000000002b497000 CR4:
00000000000006e0
[  101.359197] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  101.359201] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[  101.359206] Process modprobe (pid: 4496, threadinfo ffff81002b456000, task
ffff81003de1ef50)
[  101.359210] Stack:  ffffffff80333a98 0000000000000086 ffff81003f023af8
ffff810030efd000
[  101.359221]  ffff81003f1627e0 ffff81003f023800 ffff81003e31c000
ffff81003f1627e0
[  101.359230]  0000000000000000 ffff81002b457a08 ffffffff803fe7e1
ffff81003f023b38
[  101.359237] Call Trace:
[  101.359248]  [<ffffffff80333a98>] elv_next_request+0xe8/0x180
[  101.359256]  [<ffffffff803fe7e1>] scsi_request_fn+0x71/0x380
[  101.359264]  [<ffffffff803375b8>] __generic_unplug_device+0x28/0x30
[  101.359270]  [<ffffffff80337623>] blk_execute_rq_nowait+0x63/0xb0
[  101.359276]  [<ffffffff80339113>] blk_execute_rq+0x73/0xe0
[  101.359283]  [<ffffffff80337775>] get_request_wait+0x25/0x120
[  101.359288]  [<ffffffff80337896>] blk_get_request+0x26/0x80
[  101.359296]  [<ffffffff803fe5b2>] scsi_execute+0xe2/0x110
[  101.359301]  [<ffffffff803fe661>] scsi_execute_req+0x81/0xf0
[  101.359312]  [<ffffffff8800d713>] :sr_mod:sr_probe+0x1e3/0x630
[  101.359323]  [<ffffffff803c8d01>] driver_probe_device+0xa1/0x1c0
[  101.359329]  [<ffffffff803c8ff5>] __driver_attach+0xe5/0xf0
[  101.359334]  [<ffffffff803c8f10>] __driver_attach+0x0/0xf0
[  101.359342]  [<ffffffff803c7ee3>] bus_for_each_dev+0x53/0x80
[  101.359348]  [<ffffffff803c8b3c>] driver_attach+0x1c/0x20
[  101.359353]  [<ffffffff803c8305>] bus_add_driver+0xa5/0x210
[  101.359360]  [<ffffffff803c922a>] driver_register+0x4a/0x80
[  101.359367]  [<ffffffff80402241>] scsi_register_driver+0x11/0x20
[  101.359374]  [<ffffffff8801303c>] :sr_mod:init_sr+0x3c/0x5c
[  101.359382]  [<ffffffff80268a23>] sys_init_module+0x153/0x1a80
[  101.359395]  [<ffffffff8020bdee>] system_call+0x7e/0x83
[  101.359399]
[  101.359401]
[  101.359402] Code:  Bad RIP value.
[  101.359409] RIP  [<ffffffff88021cc0>]
[  101.359414]  RSP <ffff81002b457970>
[  101.359416] CR2: ffffffff88021cc0
[  101.359425] ---[ end trace c303fca3a91ba9a8 ]---
[  101.359429] note: modprobe[4496] exited with preempt_count 1
[  101.359438] BUG: sleeping function called from invalid context at
kernel/rwsem.c:21
[  101.359451] in_atomic():1, irqs_disabled():0
[  101.359460] INFO: lockdep is turned off.
[  101.359469] Pid: 4496, comm: modprobe Tainted: G      D 2.6.24-rc6-git7 #5
[  101.359480]
[  101.359482] Call Trace:
[  101.359494]  [<ffffffff80232ee2>] __might_sleep+0xc2/0xf0
[  101.359509]  [<ffffffff8056f56d>] down_read+0x1d/0x50
[  101.359521]  [<ffffffff8023cb80>] exit_mm+0x30/0x110
[  101.359532]  [<ffffffff8023e6d7>] do_exit+0x1c7/0x950
[  101.359546]  [<ffffffff802268b8>] do_page_fault+0x5a8/0x860
[  101.359564]  [<ffffffff805712ad>] error_exit+0x0/0xa9
[  101.359578]  [<ffffffff80333a98>] elv_next_request+0xe8/0x180
[  101.359591]  [<ffffffff803fe7e1>] scsi_request_fn+0x71/0x380
[  101.359606]  [<ffffffff803375b8>] __generic_unplug_device+0x28/0x30
[  101.359619]  [<ffffffff80337623>] blk_execute_rq_nowait+0x63/0xb0
[  101.359633]  [<ffffffff80339113>] blk_execute_rq+0x73/0xe0
[  101.359647]  [<ffffffff80337775>] get_request_wait+0x25/0x120
[  101.359660]  [<ffffffff80337896>] blk_get_request+0x26/0x80
[  101.359674]  [<ffffffff803fe5b2>] scsi_execute+0xe2/0x110
[  101.359688]  [<ffffffff803fe661>] scsi_execute_req+0x81/0xf0
[  101.359703]  [<ffffffff8800d713>] :sr_mod:sr_probe+0x1e3/0x630
[  101.359719]  [<ffffffff803c8d01>] driver_probe_device+0xa1/0x1c0
[  101.359731]  [<ffffffff803c8ff5>] __driver_attach+0xe5/0xf0
[  101.359742]  [<ffffffff803c8f10>] __driver_attach+0x0/0xf0
[  101.359756]  [<ffffffff803c7ee3>] bus_for_each_dev+0x53/0x80
[  101.359768]  [<ffffffff803c8b3c>] driver_attach+0x1c/0x20
[  101.359780]  [<ffffffff803c8305>] bus_add_driver+0xa5/0x210
[  101.359793]  [<ffffffff803c922a>] driver_register+0x4a/0x80
[  101.359807]  [<ffffffff80402241>] scsi_register_driver+0x11/0x20
[  101.359821]  [<ffffffff8801303c>] :sr_mod:init_sr+0x3c/0x5c
[  101.359834]  [<ffffffff80268a23>] sys_init_module+0x153/0x1a80
[  101.359853]  [<ffffffff8020bdee>] system_call+0x7e/0x83
[  101.359865]

This is clearly showing it was in the request function and probably
indirected via a bad pointer.

I think something in the remove chain triggered a reinsertion of the
module (this would be something in the user land hotplug scripts or
something in the kernel triggering it).  Now, for the second time of
asking, what is the underlying driver or HBA the cdrom is attached to?

James
Comment 13 Nick 2008-01-02 07:46:50 UTC
> Now, for the second time of
> asking, what is the underlying driver or HBA the cdrom is attached to?

Sorry for not responding at the first time.
It's multi DVD-RW drive.
Comment 14 Anonymous Emailer 2008-01-02 07:50:32 UTC
Reply-To: pw@osc.edu

James.Bottomley@HansenPartnership.com wrote on Tue, 01 Jan 2008 21:24 -0600:
> 
> On Tue, 2008-01-01 at 18:10 -0800, Andrew Morton wrote:
> > On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org
> wrote:
> > 
> > > http://bugzilla.kernel.org/show_bug.cgi?id=9674
> > > 
> > >            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod,
> ricoh_mmc,
> > >                     mmc_core
> > 
> > Guys, this is a very recent regression.  Could you please take a look, see
> > if it's due to mmc, block or scsi changes?
> 
> There's not a lot of information to go on.  The stack trace looks bogus,
> so I guess the kernel is compiled without a frame pointer.  However, it
> does look like the initial insertion of sr_mod is going through and it
> generates a command which gets into scsi_request_fn and then indirects
> through a bogus queueucommand pointer.

Bogus prep_rq_fn actually.

> What's the actual underlying device the cdrom is attached to?
> 
> There's no real changes to SCSI in this area from 2.6.24-rc4 ...
> however, the reinsertion is suggestive, it's like the removal is
> retriggering a module request for some reason.

Here's a guess.  When sr_mod is removed, it looks like the request
queue prep_rq_fn is still pointing to the now nonexistent
sr_prep_fn.  This may have been due to a commit that went in early
2.6.24:

    commit 7f9a6bc4e9d59e7fcf03ed23f60cd81ca5d80b65
    Author: James Bottomley <James.Bottomley@steeleye.com>
    Date:   Sat Aug 4 10:06:25 2007 -0500

    [SCSI] move ULD attachment into the prep function
    
    One of the intents of the block prep function was to allow ULDs to use
    it for preprocessing.  The original SCSI model was to have a single prep
    function and add a pointer indirect filter to build the necessary
    commands.  This patch reverses that, does away with the init_command
    field of the scsi_driver structure and makes ULDs attach directly to the
    prep function instead.  The value is really that it allows us to begin
    to separate the ULDs from the SCSI mid layer (as long as they don't use
    any core functions---which is hard at the moment---a ULD doesn't even
    need SCSI to bind).
    
    Acked-by: Jens Axboe <jens.axboe@oracle.com>
    Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

When the module is re-inserted, it does a few SCSI commands before
setting up the new prep_rq_fn, presumably hitting this bogus
pointer.

One fix would be to have sr remember the original prep function and
restore it in sr_kref_release.  Sd and a few other drivers have this
issue.  Ide-cd bothers to set prep_rq_fn to NULL as it releases
the device.

		-- Pete
Comment 15 Anonymous Emailer 2008-01-02 08:34:03 UTC
Reply-To: James.Bottomley@HansenPartnership.com


On Wed, 2008-01-02 at 10:49 -0500, Pete Wyckoff wrote:
> James.Bottomley@HansenPartnership.com wrote on Tue, 01 Jan 2008 21:24 -0600:
> > 
> > On Tue, 2008-01-01 at 18:10 -0800, Andrew Morton wrote:
> > > On Tue,  1 Jan 2008 14:55:45 -0800 (PST) bugme-daemon@bugzilla.kernel.org
> wrote:
> > > 
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=9674
> > > > 
> > > >            Summary: Oops during rmmod'ing modeuls sdhci, sr_mod,
> ricoh_mmc,
> > > >                     mmc_core
> > > 
> > > Guys, this is a very recent regression.  Could you please take a look,
> see
> > > if it's due to mmc, block or scsi changes?
> > 
> > There's not a lot of information to go on.  The stack trace looks bogus,
> > so I guess the kernel is compiled without a frame pointer.  However, it
> > does look like the initial insertion of sr_mod is going through and it
> > generates a command which gets into scsi_request_fn and then indirects
> > through a bogus queueucommand pointer.
> 
> Bogus prep_rq_fn actually.
> 
> > What's the actual underlying device the cdrom is attached to?
> > 
> > There's no real changes to SCSI in this area from 2.6.24-rc4 ...
> > however, the reinsertion is suggestive, it's like the removal is
> > retriggering a module request for some reason.
> 
> Here's a guess.  When sr_mod is removed, it looks like the request
> queue prep_rq_fn is still pointing to the now nonexistent
> sr_prep_fn.  This may have been due to a commit that went in early
> 2.6.24:
> 
>     commit 7f9a6bc4e9d59e7fcf03ed23f60cd81ca5d80b65
>     Author: James Bottomley <James.Bottomley@steeleye.com>
>     Date:   Sat Aug 4 10:06:25 2007 -0500
> 
>     [SCSI] move ULD attachment into the prep function
>     
>     One of the intents of the block prep function was to allow ULDs to use
>     it for preprocessing.  The original SCSI model was to have a single prep
>     function and add a pointer indirect filter to build the necessary
>     commands.  This patch reverses that, does away with the init_command
>     field of the scsi_driver structure and makes ULDs attach directly to the
>     prep function instead.  The value is really that it allows us to begin
>     to separate the ULDs from the SCSI mid layer (as long as they don't use
>     any core functions---which is hard at the moment---a ULD doesn't even
>     need SCSI to bind).
>     
>     Acked-by: Jens Axboe <jens.axboe@oracle.com>
>     Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
> 
> When the module is re-inserted, it does a few SCSI commands before
> setting up the new prep_rq_fn, presumably hitting this bogus
> pointer.
> 
> One fix would be to have sr remember the original prep function and
> restore it in sr_kref_release.  Sd and a few other drivers have this
> issue.  Ide-cd bothers to set prep_rq_fn to NULL as it releases
> the device.

Bingo .. that's why we ask the list, thanks Pete!

I don't think the fix is the correct one, but I've attached what I think
is the correct fix (basically, there's a bus callback we can use to
ensure the right thing always gets done rather than relying on drivers
doing it in their own release methods, that way they can't forget).

The reason it was showing up in -rc4 I suspect is because something was
structurally altering the module stack, and the address that sr_mod was
loaded was changed, so the prep_fn as Pete said was pointing into bogus
address space.

The way to trigger this bug 100% of the time is to rmmod sr_mod and then
send an inquiry (or another command) to the device using the sg node.

James

---

Index: BUILD-2.6/drivers/scsi/scsi_lib.c
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_lib.c	2008-01-01 10:13:33.000000000 -0600
+++ BUILD-2.6/drivers/scsi/scsi_lib.c	2008-01-02 10:17:51.000000000 -0600
@@ -1324,7 +1324,7 @@ int scsi_prep_return(struct request_queu
 }
 EXPORT_SYMBOL(scsi_prep_return);
 
-static int scsi_prep_fn(struct request_queue *q, struct request *req)
+int scsi_prep_fn(struct request_queue *q, struct request *req)
 {
 	struct scsi_device *sdev = q->queuedata;
 	int ret = BLKPREP_KILL;
Index: BUILD-2.6/drivers/scsi/scsi_priv.h
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_priv.h	2007-11-03 09:08:46.000000000 -0500
+++ BUILD-2.6/drivers/scsi/scsi_priv.h	2008-01-02 10:20:09.000000000 -0600
@@ -74,6 +74,9 @@ extern struct request_queue *scsi_alloc_
 extern void scsi_free_queue(struct request_queue *q);
 extern int scsi_init_queue(void);
 extern void scsi_exit_queue(void);
+struct request_queue;
+struct request;
+extern int scsi_prep_fn(struct request_queue *, struct request *);
 
 /* scsi_proc.c */
 #ifdef CONFIG_SCSI_PROC_FS
Index: BUILD-2.6/drivers/scsi/scsi_sysfs.c
===================================================================
--- BUILD-2.6.orig/drivers/scsi/scsi_sysfs.c	2007-11-03 10:08:02.000000000 -0500
+++ BUILD-2.6/drivers/scsi/scsi_sysfs.c	2008-01-02 10:31:33.000000000 -0600
@@ -373,12 +373,24 @@ static int scsi_bus_resume(struct device
 	return err;
 }
 
+static int scsi_bus_remove(struct device *dev)
+{
+	struct scsi_device *sdev = to_scsi_device(dev);
+
+	/* reset the prep_fn back to the default since the
+	 * driver may have altered it and it's being removed */
+	blk_queue_prep_rq(sdev->request_queue, scsi_prep_fn);
+
+	return 0;
+}
+
 struct bus_type scsi_bus_type = {
         .name		= "scsi",
         .match		= scsi_bus_match,
 	.uevent		= scsi_bus_uevent,
 	.suspend	= scsi_bus_suspend,
 	.resume		= scsi_bus_resume,
+	.remove		= scsi_bus_remove,
 };
 
 int scsi_sysfs_register(void)
Comment 16 Rafael J. Wysocki 2008-01-02 11:48:25 UTC
Corrected patch: http://marc.info/?l=linux-scsi&m=119929408920573&w=4
Comment 17 Nick 2008-01-02 13:23:55 UTC
Rafael,

the patch does fix the prob (!!)
Thanks

But dmesg shows this strange line (pasting and sr_mod initialization too just to be clear):

[  100.060789] Driver 'sr' needs updating - please use bus_type methods
[  100.070853] sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
[  100.070981] sr 1:0:0:0: Attached scsi CD-ROM sr0

Guess, something still need to be updated
Comment 18 Rafael J. Wysocki 2008-01-02 13:53:53 UTC
Please don't remove the Bug #9243 dependency, thanks.
Comment 19 Nick 2008-01-03 12:09:28 UTC
Sorry for reopening this but the dmesg line from my previous post makes me think the fix is not the best:

[  100.060789] Driver 'sr' needs updating - please use bus_type methods
Comment 20 Adrian Bunk 2008-01-03 13:15:38 UTC
The patch is now commit 751bf4d7865e4ced406be93b04c7436d866d3684 in Linus' tree.
Comment 21 Nick 2008-01-03 13:23:28 UTC
Ok Adrian,

so, I'm assuming the notice is something to be fixed later?
Comment 22 Adrian Bunk 2008-01-03 13:26:45 UTC
The notice should be unrelated.
Comment 23 Nick 2008-01-03 13:51:38 UTC
Sorry, but you are wrong Adrian.

Here is the listing of drivers/base/driver.c::driver_register() function which is called in drivers/scsi/scsi_sysfs.c::scsi_register_driver()

--------- CUT -------------
int driver_register(struct device_driver * drv)
{
        if ((drv->bus->probe && drv->probe) ||
            (drv->bus->remove && drv->remove) ||
            (drv->bus->shutdown && drv->shutdown)) {
                printk(KERN_WARNING "Driver '%s' needs updating - please use bus_type methods\n", drv->name);
        }
        klist_init(&drv->klist_devices, NULL, NULL);
        return bus_add_driver(drv);
}
--------- /CUT -------------

As you can see the notice is printed once 'probe', 'remove' or 'shutdown' fields are defined.
The patch adds 'remove' field to 'struct bus_type' at causes the notice to be printed.
So, it's related and I guess should be rewritten.

Please correct me if I'm wrong.
Comment 24 Rafael J. Wysocki 2008-01-03 14:40:24 UTC
I think this is a clean up matter unrelated to the issue fixed by the patch.  It can be done in a separate patch in the future.
Comment 25 Nick 2008-01-03 14:46:38 UTC
Rafael,

yes, it's unrelated directly.

But the patch is using some old framework which is not good.
And that check in driver_register() monitors such attempts to use those old
things.

Guess we just can't add such deprecated code that causes such warnings to appear.
It's obvious that we can't accept such deprecated hacks.
Comment 26 Adrian Bunk 2008-01-03 14:54:45 UTC
(In reply to comment #25)
> Rafael,
> 
> yes, it's unrelated directly.
> 
> But the patch is using some old framework which is not good.
> And that check in driver_register() monitors such attempts to use those old
> things.
> 
> Guess we just can't add such deprecated code that causes such warnings to
> appear.
> It's obvious that we can't accept such deprecated hacks.

The one thing that is obvious is that this is a regression that had to be fixed quickly (since 2.6.24 might soon be released).
Comment 27 Nick 2008-01-03 14:59:13 UTC
Ok

I'm giving up.
It was just a small regression and it's fixed.

Note You need to log in before you can comment on or make changes to this bug.