Bug 9264

Summary: leds: ledtrig-timer calls sleeping function from invalid context
Product: Drivers Reporter: Márton Németh (nm127)
Component: OtherAssignee: Richard Purdie (rpurdie)
Status: RESOLVED CODE_FIX    
Severity: normal CC: akpm, greg, htejun, mingo, rpurdie
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.24-rc1 Subsystem:
Regression: --- Bisected commit-id:
Bug Depends on:    
Bug Blocks: 9243    
Attachments: .config of 2.6.24-rc1
Kernel-space test for DEBUG_SPINLOCK_SLEEP and kzalloc()
change GFP_KERNEL to GFP_ATOMIC
Proposed fix
Kernel-space test for DEBUG_SPINLOCK_SLEEP and kzalloc()

Description Márton Németh 2007-10-30 22:31:05 UTC
Most recent kernel where this bug did not occur: -
Distribution: Debian 4.0
Hardware Environment: Clevo D410J
Software Environment:
Problem Description:

When I activate and deactivate the ledtrig-timer module for a LED I get the following BUG message in dmesg:

When activating:

BUG: sleeping function called from invalid context at mm/slab.c:3050
in_atomic():1, irqs_disabled():0
3 locks held by bash/3557:
 #0:  (&buffer->mutex){--..}, at: [<c01b4f3b>] sysfs_write_file+0x2b/0x100
 #1:  (triggers_list_lock){----}, at: [<c0296efa>] led_trigger_store+0x7a/0x110
 #2:  (&led_cdev->trigger_lock){--..}, at: [<c0296f5c>] led_trigger_store+0xdc/0x110
 [<c010533a>] show_trace_log_lvl+0x1a/0x30
 [<c0105d32>] show_trace+0x12/0x20
 [<c0105e85>] dump_stack+0x15/0x20
 [<c011a478>] __might_sleep+0xb8/0xd0
 [<c01749cf>] kmem_cache_alloc+0xcf/0x120
 [<c01b5ac4>] sysfs_new_dirent+0x34/0x100
 [<c01b52cf>] sysfs_add_file+0x2f/0x90
 [<c01b535c>] sysfs_create_file+0x2c/0x40
 [<c0278efd>] device_create_file+0x2d/0x50
 [<f8d0314a>] timer_trig_activate+0x4a/0xa0 [ledtrig_timer]
 [<c0296a09>] led_trigger_set+0xa9/0xd0
 [<c0296f65>] led_trigger_store+0xe5/0x110
 [<c027839f>] dev_attr_store+0x2f/0x40
 [<c01b4fb2>] sysfs_write_file+0xa2/0x100
 [<c0177ff9>] vfs_write+0x99/0x130
 [<c01785dd>] sys_write+0x3d/0x70
 [<c01042d2>] sysenter_past_esp+0x5f/0xa5
 =======================

When deactivating:

BUG: sleeping function called from invalid context at kernel/mutex.c:207
in_atomic():1, irqs_disabled():0
2 locks held by bash/3557:
 #0:  (&buffer->mutex){--..}, at: [<c01b4f3b>] sysfs_write_file+0x2b/0x100
 #1:  (&led_cdev->trigger_lock){--..}, at: [<c0296ed5>] led_trigger_store+0x55/0x110
 [<c010533a>] show_trace_log_lvl+0x1a/0x30
 [<c0105d32>] show_trace+0x12/0x20
 [<c0105e85>] dump_stack+0x15/0x20
 [<c011a478>] __might_sleep+0xb8/0xd0
 [<c030306d>] mutex_lock_nested+0x1d/0x2f0
 [<c01b5a1c>] sysfs_addrm_start+0x2c/0xa0
 [<c01b4808>] sysfs_hash_and_remove+0x28/0x70
 [<c01b523d>] sysfs_remove_file+0xd/0x10
 [<c0278ab7>] device_remove_file+0x27/0x40
 [<f8d03023>] timer_trig_deactivate+0x23/0x50 [ledtrig_timer]
 [<c02969ba>] led_trigger_set+0x5a/0xd0
 [<c0296ede>] led_trigger_store+0x5e/0x110
 [<c027839f>] dev_attr_store+0x2f/0x40
 [<c01b4fb2>] sysfs_write_file+0xa2/0x100
 [<c0177ff9>] vfs_write+0x99/0x130
 [<c01785dd>] sys_write+0x3d/0x70
 [<c01042d2>] sysenter_past_esp+0x5f/0xa5


Steps to reproduce:
# modprobe -k ledtrig-timer
# modprobe -k leds-clevo-mail
# echo timer >/sys/class/leds/clevo::mail/trigger 
# echo none >/sys/class/leds/clevo::mail/trigger
Comment 1 Márton Németh 2007-10-30 22:32:56 UTC
Created attachment 13357 [details]
.config of 2.6.24-rc1
Comment 2 Andrew Morton 2007-11-08 09:14:25 UTC
did this happen in earlier kernels, or is it new behaviour in 2.6.24-rc1?
Comment 3 Tejun Heo 2007-11-08 17:58:09 UTC
Looks like a bug in the led driver.  led_trigger_store() calls led_trigger_set() with led_cdev->trigger_lock write locked.  led_trigger_set() in turn calls timer_trig_activate() which in turn calls kzalloc with GFP_KERNEL and then device_create_file() which eventually calls sysfs_new_dirent() which triggers BUG as it tries GFP_KERNEL allocation while a rwlock_t is held.

The question is why the kzalloc() call from led_trigger_set() didn't cause the BUG first.  Anyways, that lock should either be a rwsem or led_trigger_set() shouldn't try to allocate using GFP_KERNEL or call device_create_file().
Comment 4 Márton Németh 2007-11-09 23:43:01 UTC
Created attachment 13491 [details]
Kernel-space test for DEBUG_SPINLOCK_SLEEP and kzalloc()

The attached test tries to trigger the BUG message when kzalloc() is used badly.

TestCase1: first calls write_lock() then calls kzalloc(..., GFP_ATOMIC) and finally write_unlock(). The BUG message should not appear.

TestCase2: calls write_lock() then calls kzalloc(..., GFP_KERNEL) and finally write_unlock(). The BUG message should appear.

Test loaded
TestCase1: Try to kzalloc(..., GFP_ATOMIC) while a write lock is held
TestCase1: in_atomic():1, irqs_disabled():0
TestCase1: finished
TestCase2: Try to kzalloc(..., GFP_KERNEL) while a write lock is held
TestCase2: in_atomic():1, irqs_disabled():0
BUG: sleeping function called from invalid context at mm/slab.c:3050
in_atomic():1, irqs_disabled():0
1 lock held by insmod/10900:
 #0:  (&test_lock#2){--..}, at: [<f8cca0b8>] test_init+0xb8/0x11a [spinlock_sleep_test]
 [<c010533a>] show_trace_log_lvl+0x1a/0x30
 [<c0105d32>] show_trace+0x12/0x20
 [<c0105e85>] dump_stack+0x15/0x20
 [<c011a478>] __might_sleep+0xb8/0xd0
 [<c01749cf>] kmem_cache_alloc+0xcf/0x120
 [<f8cca0fa>] test_init+0xfa/0x11a [spinlock_sleep_test]
 [<c01470e1>] sys_init_module+0x131/0x15a0
 [<c01042d2>] sysenter_past_esp+0x5f/0xa5
 =======================
TestCase2: finished
Test unloading

It seems that the BUG detection works correctly.
Comment 5 Márton Németh 2007-11-09 23:57:13 UTC
Created attachment 13492 [details]
change GFP_KERNEL to GFP_ATOMIC

I'm afraid I was not carefully enough to report this issue using an unmodified kernel. I found that I already changed the GFP_KERNEL to GFP_ATOMIC in drivers/leds/ledtrig-timer.c, sorry about that.

However, this change did not removed the BUG message in dmesg.
Comment 6 Richard Purdie 2007-11-10 03:53:52 UTC
Looking at the code two of the locks should be rwsems. I'll write a patch...
Comment 7 Richard Purdie 2007-11-10 04:51:12 UTC
Created attachment 13493 [details]
Proposed fix

The attached patch should fix the problem, can you test and confirm please?
Comment 8 Márton Németh 2007-11-10 22:35:02 UTC
Created attachment 13495 [details]
Kernel-space test for DEBUG_SPINLOCK_SLEEP and kzalloc()

TestCase3 added to test init_rwsem()/down_write()/up_write() concept. It seems to be working:

TestCase3: Try to kzalloc(..., GFP_KERNEL) while a write semaphore is held
TestCase3: in_atomic():0, irqs_disabled():0
TestCase3: finished

I'll try the patch from comment #7 soon.
Comment 9 Márton Németh 2007-11-10 23:20:19 UTC
The patch from comment #7 gets rid of the BUG messages, thanks.
Comment 10 Ingo Molnar 2007-12-04 02:18:51 UTC
Ping: the patch in the bugzilla solved the regression but the fix is still not upstream it appears, as of 2.6.24-rc4.
Comment 11 Richard Purdie 2007-12-07 01:02:56 UTC
The patch has been in the LED tree and -mm, I'll make sure it gets to Linus.