Bug 38602 - suspicious rcu_dereference_check() usage in block/cfq-iosched.c:2776
Status: CLOSED CODE_FIX
Product: IO/Storage
Classification: Unclassified
Component: Block Layer
Hardware: All Linux
Importance: P1 normal
Assigned To: Jens Axboe
Depends on:
Blocks: 36912
 
Reported: 2011-07-01 04:18 UTC by Christian Casteyde
Modified: 2011-07-13 18:43 UTC (History)

See Also:
Kernel Version: 3.0-rc5
Tree: Mainline
Regression: Yes


Attachments
dmesg-kernel-3.0-0.rc5.git0.1.fc16 (80.50 KB, text/plain)
2011-07-01 22:05 UTC, Joshua Covington

Description Christian Casteyde 2011-07-01 04:18:40 UTC
Acer Aspire 7750G
Core i7 in 64bit mode
Slackware64 13.37

Since 3.0-rc5, I get the following at boot:
device-mapper: multipath round-robin: version 1.0.0 loaded
scsi 6:0:0:0: Direct-Access     Generic- Multi-Card       1.00 PQ: 0 ANSI: 0 CCS

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
block/cfq-iosched.c:2776 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
3 locks held by scsi_scan_6/1552:
 #0:  (&shost->scan_mutex){+.+.+.}, at: [<ffffffff8145efca>] scsi_scan_host_selected+0x5a/0x150
 #1:  (&eq->sysfs_lock){+.+...}, at: [<ffffffff812a5032>] elevator_exit+0x22/0x60
 #2:  (&(&q->__queue_lock)->rlock){-.-...}, at: [<ffffffff812b6233>] cfq_exit_queue+0x43/0x190

stack backtrace:
Pid: 1552, comm: scsi_scan_6 Not tainted 3.0.0-rc5 #17
Call Trace:
 [<ffffffff810abb9b>] lockdep_rcu_dereference+0xbb/0xc0
 [<ffffffff812b6139>] __cfq_exit_single_io_context+0xe9/0x120
 [<ffffffff812b626c>] cfq_exit_queue+0x7c/0x190
 [<ffffffff812a5046>] elevator_exit+0x36/0x60
 [<ffffffff812a802a>] blk_cleanup_queue+0x4a/0x60
 [<ffffffff8145cc09>] scsi_free_queue+0x9/0x10
 [<ffffffff81460944>] __scsi_remove_device+0x84/0xd0
 [<ffffffff8145dca3>] scsi_probe_and_add_lun+0x353/0xb10
 [<ffffffff817da069>] ? error_exit+0x29/0xb0
 [<ffffffff817d98ed>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
 [<ffffffff8145e722>] __scsi_scan_target+0x112/0x680
 [<ffffffff812c690d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
 [<ffffffff817da069>] ? error_exit+0x29/0xb0
 [<ffffffff812bcc60>] ? kobject_del+0x40/0x40
 [<ffffffff8145ed16>] scsi_scan_channel+0x86/0xb0
 [<ffffffff8145f0b0>] scsi_scan_host_selected+0x140/0x150
 [<ffffffff8145f149>] do_scsi_scan_host+0x89/0x90
 [<ffffffff8145f170>] do_scan_async+0x20/0x160
 [<ffffffff8145f150>] ? do_scsi_scan_host+0x90/0x90
 [<ffffffff810975b6>] kthread+0xa6/0xb0
 [<ffffffff817db154>] kernel_thread_helper+0x4/0x10
 [<ffffffff81066430>] ? finish_task_switch+0x80/0x110
 [<ffffffff817d9c04>] ? retint_restore_args+0xe/0xe
 [<ffffffff81097510>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff817db150>] ? gs_change+0xb/0xb
Comment 1 Joshua Covington 2011-07-01 22:05:22 UTC
Created attachment 64442 [details]
dmesg-kernel-3.0-0.rc5.git0.1.fc16

I get a very similar message during boot with kernel-3.0-0.rc5.git0.1.fc16:

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
block/cfq-iosched.c:2776 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
3 locks held by scsi_scan_0/208:
 #0:  (&shost->scan_mutex){+.+.+.}, at: [<ffffffff8133d968>] scsi_scan_host_selected+0xbf/0x191
 #1:  (&eq->sysfs_lock){+.+...}, at: [<ffffffff8123902a>] elevator_exit+0x1d/0x4e
 #2:  (&(&q->__queue_lock)->rlock){..-...}, at: [<ffffffff8125032f>] cfq_exit_queue+0x47/0x179

stack backtrace:
Pid: 208, comm: scsi_scan_0 Not tainted 3.0-0.rc5.git0.1.fc16.x86_64 #1
Call Trace:
 [<ffffffff81086e4d>] lockdep_rcu_dereference+0xa8/0xb0
 [<ffffffff81250227>] __cfq_exit_single_io_context+0x78/0xd7
 [<ffffffff81250353>] cfq_exit_queue+0x6b/0x179
 [<ffffffff8123903e>] elevator_exit+0x31/0x4e
 [<ffffffff8123d501>] blk_cleanup_queue+0x4f/0x68
 [<ffffffff8133b931>] scsi_free_queue+0xe/0x10
 [<ffffffff8133efb2>] __scsi_remove_device+0xac/0xb9
 [<ffffffff8133cee8>] scsi_probe_and_add_lun+0xa6e/0xaab
 [<ffffffff8133d5ff>] __scsi_scan_target+0x580/0x5d2
 [<ffffffff81088007>] ? mark_lock+0x2d/0x220
 [<ffffffff81089654>] ? mark_held_locks+0x4b/0x6d
 [<ffffffff814f35d0>] ? _raw_spin_unlock_irqrestore+0x45/0x52
 [<ffffffff81089781>] ? trace_hardirqs_on_caller+0x10b/0x12f
 [<ffffffff8133d6a8>] scsi_scan_channel.part.2+0x57/0x72
 [<ffffffff8133d9b2>] scsi_scan_host_selected+0x109/0x191
 [<ffffffff8133daaf>] ? do_scsi_scan_host+0x75/0x75
 [<ffffffff8133daaa>] do_scsi_scan_host+0x70/0x75
 [<ffffffff8133dad2>] do_scan_async+0x23/0x142
 [<ffffffff8133daaf>] ? do_scsi_scan_host+0x75/0x75
 [<ffffffff8133daaf>] ? do_scsi_scan_host+0x75/0x75
 [<ffffffff810745e1>] kthread+0xa8/0xb0
 [<ffffffff814fb324>] kernel_thread_helper+0x4/0x10
 [<ffffffff814f39d4>] ? retint_restore_args+0x13/0x13
 [<ffffffff81074539>] ? __init_kthread_worker+0x5a/0x5a
 [<ffffffff814fb320>] ? gs_change+0x13/0x13

The full dmesg is also attached. The notebook is an Aspire 5050.
Comment 2 Joshua Covington 2011-07-07 18:21:42 UTC
Any chance to get this fixed before v3.0 is out?
Comment 3 Paul E. McKenney 2011-07-08 16:00:52 UTC
I believe that 3181faa85bd (cfq-iosched: fix a rcu warning), which is in mainline, fixes this problem.  Could you please try it out?
Comment 4 Joshua Covington 2011-07-08 17:18:26 UTC
Commit 2a9d6df425d7b46b23cbc8673b2dfefa4678abdb was merged into master some hours ago. I'll wait till rc7 is out and report back whether it fixes the warning.
Comment 5 Joshua Covington 2011-07-10 20:46:12 UTC
I tested this with kernel-3.0-0.rc6.git6.1.fc16.x86_64. Everything is fine now - no rcu warnings. This bug can be closed now (it is already).
Comment 6 Paul E. McKenney 2011-07-10 23:08:52 UTC
Very good, Joshua!  Thank you for testing this!
Comment 7 Christian Casteyde 2011-07-13 18:43:26 UTC
Fixed in 3.0-rc7
