Bug 15581

Summary: Kernel panic - not syncing: hung_task: blocked tasks
Product: Other
Reporter: Krzysztof Mościcki (stivi)
Component: Other
Assignee: other_other
Status: RESOLVED OBSOLETE
Severity: high
CC: alan
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 2.6.33.1
Subsystem:
Regression: No
Bisected commit-id:

Description Krzysztof Mościcki 2010-03-19 11:41:39 UTC
Distribution: Debian Lenny

Hardware Environment:
INTEL Server Board S5520HC
Intel(R) Xeon(R) CPU X5560  @ 2.80GHz
2x RAID bus controller 3ware Inc 9690SA-8I

Software Environment: squid (multiple instances)

Problem Description:

The following four logs are from kernel panics that occurred within two days:


- log nr 1:
[  373.605355] INFO: task jbd2/sdc1-8:3412 blocked for more than 120 seconds.
[  373.687616] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  373.781339] jbd2/sdc1-8   D ffff88066f1120c0     0  3412      2 0x00000000
[  373.864089]  ffff88066f2e4690 0000000000000046 0000000000000000 ffff880662da1c60
[  373.953452]  000000000000dd48 ffff880662da1fd8 0000000000012800 0000000000012800
[  374.043020]  0000000200000286 ffff880662da1e08 ffff88066fa10050 ffff88066f2e4920
[  374.132352] Call Trace:
[  374.161658]  [<ffffffffa0243f5f>] ? jbd2_journal_commit_transaction+0x1ce/0x1173 [jbd2]
[  374.257434]  [<ffffffff8102995d>] ? dequeue_entity+0x18/0x135
[  374.326263]  [<ffffffff8104c090>] ? autoremove_wake_function+0x0/0x2e
[  374.403338]  [<ffffffff81040d30>] ? lock_timer_base+0x26/0x4b
[  374.472146]  [<ffffffff8104123d>] ? try_to_del_timer_sync+0xa0/0xaa
[  374.547141]  [<ffffffffa0249c96>] ? kjournald2+0xbd/0x1dd [jbd2]
[  374.619057]  [<ffffffff8104c090>] ? autoremove_wake_function+0x0/0x2e
[  374.696118]  [<ffffffffa0249bd9>] ? kjournald2+0x0/0x1dd [jbd2]
[  374.766970]  [<ffffffff8104bc79>] ? kthread+0x79/0x81
[  374.827477]  [<ffffffff81003614>] ? kernel_thread_helper+0x4/0x10
[  374.900430]  [<ffffffff8104bc00>] ? kthread+0x0/0x81
[  374.959902]  [<ffffffff81003610>] ? kernel_thread_helper+0x0/0x10
[  375.032817] Kernel panic - not syncing: hung_task: blocked tasks
[  375.104711] Pid: 415, comm: khungtaskd Not tainted 2.6.33.1-univ #1


- log nr 2:
[  373.604583] INFO: task jbd2/sdc1-8:3456 blocked for more than 120 seconds.
[  373.686892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  373.780621] jbd2/sdc1-8   D ffff88066facf780     0  3456      2 0x00000000
[  373.863291]  ffff88066faa4cd0 0000000000000046 0000800060eaec28 ffff8806618d9c60
[  373.952843]  000000000000dd48 ffff8806618d9fd8 0000000000012800 0000000000012800
[  374.042213]  0000000200000286 ffff8806618d9e08 ffff88066f8b0690 ffff88066faa4f60
[  374.131581] Call Trace:
[  374.161004]  [<ffffffff81258ece>] ? common_interrupt+0xe/0x13
[  374.229899]  [<ffffffffa0240f5f>] ? jbd2_journal_commit_transaction+0x1ce/0x1173 [jbd2]
[  374.325681]  [<ffffffff8102995d>] ? dequeue_entity+0x18/0x135
[  374.394592]  [<ffffffff8104c090>] ? autoremove_wake_function+0x0/0x2e
[  374.471750]  [<ffffffff81040d30>] ? lock_timer_base+0x26/0x4b
[  374.540496]  [<ffffffff8104123d>] ? try_to_del_timer_sync+0xa0/0xaa
[  374.615473]  [<ffffffffa0246c96>] ? kjournald2+0xbd/0x1dd [jbd2]
[  374.687347]  [<ffffffff8104c090>] ? autoremove_wake_function+0x0/0x2e
[  374.764545]  [<ffffffffa0246bd9>] ? kjournald2+0x0/0x1dd [jbd2]
[  374.835403]  [<ffffffff8104bc79>] ? kthread+0x79/0x81
[  374.895949]  [<ffffffff81003614>] ? kernel_thread_helper+0x4/0x10
[  374.968863]  [<ffffffff8104bc00>] ? kthread+0x0/0x81
[  375.028295]  [<ffffffff81003610>] ? kernel_thread_helper+0x0/0x10
[  375.101192] Kernel panic - not syncing: hung_task: blocked tasks
[  375.173046] Pid: 415, comm: khungtaskd Not tainted 2.6.33.1-univ #1


- log nr 3:
[  493.357929] INFO: task jbd2/sdc1-8:3417 blocked for more than 120 seconds.
[  493.440190] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  493.533901] jbd2/sdc1-8   D ffff88066fabd280     0  3417      2 0x00000000
[  493.616561]  ffff88066fbce050 0000000000000046 0000000000000000 ffff8806638f5c90
[  493.705985]  000000000000dd48 ffff8806638f5fd8 0000000000012800 0000000000012800
[  493.795278]  00000002ffffffff ffff8806638f5e08 ffff88066f8fb310 ffff88066fbce2e0
[  493.884606] Call Trace:
[  493.913930]  [<ffffffffa0248f5f>] ? jbd2_journal_commit_transaction+0x1ce/0x1173 [jbd2]
[  494.009729]  [<ffffffff8102995d>] ? dequeue_entity+0x18/0x135
[  494.078513]  [<ffffffff8102a3fe>] ? find_busiest_queue+0xba/0xdb
[  494.150401]  [<ffffffff8104c090>] ? autoremove_wake_function+0x0/0x2e
[  494.227463]  [<ffffffff81040d30>] ? lock_timer_base+0x26/0x4b
[  494.296266]  [<ffffffff8104123d>] ? try_to_del_timer_sync+0xa0/0xaa
[  494.371285]  [<ffffffffa024ec96>] ? kjournald2+0xbd/0x1dd [jbd2]
[  494.443292]  [<ffffffff8104c090>] ? autoremove_wake_function+0x0/0x2e
[  494.520377]  [<ffffffffa024ebd9>] ? kjournald2+0x0/0x1dd [jbd2]
[  494.591212]  [<ffffffff8104bc79>] ? kthread+0x79/0x81
[  494.651796]  [<ffffffff81003614>] ? kernel_thread_helper+0x4/0x10
[  494.724736]  [<ffffffff8104bc00>] ? kthread+0x0/0x81
[  494.784166]  [<ffffffff81003610>] ? kernel_thread_helper+0x0/0x10
[  494.857142] Kernel panic - not syncing: hung_task: blocked tasks
[  494.929119] Pid: 415, comm: khungtaskd Not tainted 2.6.33.1-univ #1
[  494.929120] Call Trace:
[  494.929129]  [<ffffffff812568b7>] ? panic+0x86/0x145
[  494.929134]  [<ffffffff810054a6>] ? show_stack_log_lvl+0xfb/0x10a
[  494.929148]  [<ffffffff8106b61d>] ? watchdog+0x178/0x1ba
[  494.929149]  [<ffffffff8106b4a5>] ? watchdog+0x0/0x1ba
[  494.929153]  [<ffffffff8104bc79>] ? kthread+0x79/0x81
[  494.929159]  [<ffffffff81003614>] ? kernel_thread_helper+0x4/0x10

- log nr 4:
[69766.448212] general protection fault: 0000 [#1] SMP
[69766.507769] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.3/i2c-0/name
[69766.592010] CPU 6
[69766.616147] Pid: 3640, comm: squid Not tainted 2.6.33.1-univ #1 S5520HC/S5520HC
[69766.703525] RIP: 0010:[<ffffffff8120af69>]  [<ffffffff8120af69>] inet_csk_bind_conflict+0x8b/0xa6
[69766.809719] RSP: 0018:ffff880663769e30  EFLAGS: 00010206
[69766.873192] RAX: ffffffff8128cdea RBX: 00000000ffffffea RCX: 0000000000000000
[69766.958473] RDX: ffff880619436d40 RSI: 2c1f165691ec6855 RDI: ffff880086f339c0
[69767.043752] RBP: ffff88066df55c60 R08: 000000003790af53 R09: 2c1f165691ec6855
[69767.129027] R10: ffff88066daf4901 R11: 0000000000000000 R12: ffff88066e40c3a0
[69767.214309] R13: ffff880086f339c0 R14: 00000000ffffffff R15: 0000000000000c3a
[69767.299590] FS:  00007ffc38b8d6e0(0000) GS:ffff880028380000(0000) knlGS:0000000000000000
[69767.396321] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[69767.464990] CR2: 00000000403fddc0 CR3: 0000000663450000 CR4: 00000000000006e0
[69767.550270] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[69767.635551] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[69767.720831] Process squid (pid: 3640, threadinfo ffff880663768000, task ffff88066c685950)
[69767.818595] Stack:
[69767.842617]  ffffffff8120b16c 0000000000000000 0000000000000000 ffffffff8148fe00
[69767.929274] <0> 0000000000000005 0000000000000000 00000000ffffffea ffff880086f339c0
[69768.021365] <0> 0000000000000002 0000000000000c3a ffff880663769ec8 000000003790af53
[69768.115670] Call Trace:
[69768.144886]  [<ffffffff8120b16c>] ? inet_csk_get_port+0x1e8/0x2dc
[69768.217711]  [<ffffffff81229c46>] ? inet_bind+0x103/0x1b0
[69768.282228]  [<ffffffff811d31d1>] ? sys_bind+0x61/0x91
[69768.343629]  [<ffffffff812085ac>] ? ip_setsockopt+0x1c/0x78
[69768.410221]  [<ffffffff811d2cbe>] ? sys_setsockopt+0x8b/0x9d
[69768.477852]  [<ffffffff810028eb>] ? system_call_fastpath+0x16/0x1b
[69768.551712] Code: 0a 75 20 8a 42 1e 3c 06 74 08 8b 82 34 02 00 00 eb 03 8b 42 4c 85 c0 74 24 45 85 c0 74 1f 44 39 c0 74 1a 4c 89 ce 48

Steps to reproduce: None known. The server crashes randomly, about once per day.
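
Logs 1 to 3 end in a panic raised by khungtaskd rather than in an oops. The kernel only panics on a hung task when kernel.hung_task_panic is set, so it may help to record that setting together with the timeout. A minimal Python sketch follows; the function name and the proc_root parameter are illustrative and not part of any existing tool:

```python
from pathlib import Path


def hung_task_settings(proc_root="/proc/sys/kernel"):
    """Return the hung-task sysctls as ints; None where a file is missing."""
    settings = {}
    for name in ("hung_task_panic", "hung_task_timeout_secs"):
        path = Path(proc_root) / name
        try:
            settings[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # File absent (e.g. hung-task detection not built in) or unreadable.
            settings[name] = None
    return settings


if __name__ == "__main__":
    print(hung_task_settings())
```

With hung_task_panic at 0 the kernel would only print the "blocked for more than 120 seconds" report and keep running, which can leave more time to collect state from the blocked jbd2 thread.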

This is the same machine as in bug https://bugzilla.kernel.org/show_bug.cgi?id=15148, but running a different kernel version.
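
To compare the hung-task reports across crashes, the "blocked for more than" lines can be pulled out of saved console logs. This is a rough Python sketch that assumes the line format shown in logs 1 to 3; the names HUNG_RE and hung_tasks are made up for illustration:

```python
import re

# Matches the khungtaskd report line, for example:
# "[  373.605355] INFO: task jbd2/sdc1-8:3412 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(lines):
    """Yield (timestamp, comm, pid, timeout_secs) for each hung-task report."""
    for line in lines:
        m = HUNG_RE.search(line)
        if m:
            yield float(m["ts"]), m["comm"], int(m["pid"]), int(m["secs"])


if __name__ == "__main__":
    import sys

    # Usage: python hung_tasks.py < console.log
    for ts, comm, pid, secs in hung_tasks(sys.stdin):
        print(f"{ts:12.6f}  {comm} (pid {pid}) blocked > {secs}s")
```

In the three logs above, this would show the same jbd2/sdc1-8 thread each time, with a different pid per boot.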

Comment 1 Krzysztof Mościcki 2010-03-19 11:53:40 UTC
The igb driver has been replaced with the version from: http://downloadcenter.intel.com/detail_desc.aspx?agr=Y&DwnldID=13663