Bug 13899
Summary: | Oops from tar, 2.6.31-rc5, 32 bit on quad core phenom. | ||
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | Rafael J. Wysocki (rjw) |
Component: | i386 | Assignee: | platform_i386 |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | gene.heskett |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.31-rc5 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 13615 |
Description
Rafael J. Wysocki
2009-08-02 21:20:42 UTC
On Monday 03 August 2009, Gene Heskett wrote:
> On Sunday 02 August 2009, Rafael J. Wysocki wrote:
> >This message has been generated automatically as a part of a report
> >of recent regressions.
> >
> >The following bug entry is on the current list of known regressions
> >from 2.6.30. Please verify if it still should be listed and let me know
> >(either way).
> >
> >
> >Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13899
> >Subject : Oops from tar, 2.6.31-rc5, 32 bit on quad core
> phenom.
> >Submitter : Gene Heskett <gene.heskett@verizon.net>
> >Date : 2009-08-01 13:04 (2 days old)
> >References : http://marc.info/?l=linux-kernel&m=124913190304149&w=4
> At this time I am not sure about this one. Amanda ran normally AFAIK this
> morning.
>
> The machine does not appear to be affected, but I did just find this,
> in /var/log/messages: (I think this is the one I posted)
>
> Aug 1 01:15:01 coyote xinetd[2316]: START: amanda pid=18959
> from=::ffff:192.168.71.3
> Aug 1 01:16:17 coyote kernel: [ 7423.221754] BUG: Bad page state in process
> tar pfn:a1293
> Aug 1 01:16:17 coyote kernel: [ 7423.221760] page:c28fc260 flags:80004000
> count:0 mapcount:0 mapping:(null) index:0
> Aug 1 01:16:17 coyote kernel: [ 7423.221764] Pid: 19211, comm: tar Not
> tainted 2.6.31-rc5 #1
> Aug 1 01:16:17 coyote kernel: [ 7423.221766] Call Trace:
> Aug 1 01:16:17 coyote kernel: [ 7423.221774] [<c130b0e3>] ?
> printk+0x23/0x40
> Aug 1 01:16:17 coyote kernel: [ 7423.221780] [<c1083d6f>]
> bad_page+0xcf/0x150
> Aug 1 01:16:17 coyote kernel: [ 7423.221784] [<c1084e0d>]
> get_page_from_freelist+0x37d/0x480
> Aug 1 01:16:17 coyote kernel: [ 7423.221788] [<c107ed54>] ?
> add_to_page_cache_lru+0x84/0x90
> Aug 1 01:16:17 coyote kernel: [ 7423.221791] [<c10850ef>]
> __alloc_pages_nodemask+0xdf/0x520
> Aug 1 01:16:17 coyote kernel: [ 7423.221795] [<c1087954>]
> __do_page_cache_readahead+0x104/0x220
> Aug 1 01:16:17 coyote kernel: [ 7423.221798] [<c1087aa4>]
> ra_submit+0x34/0x50
> Aug 1 01:16:17 coyote kernel: [ 7423.221801] [<c1087d00>]
> ondemand_readahead+0x120/0x240
> Aug 1 01:16:17 coyote kernel: [ 7423.221804] [<c1087ebc>]
> page_cache_async_readahead+0x9c/0xb0
> Aug 1 01:16:17 coyote kernel: [ 7423.221807] [<c10806ac>]
> generic_file_aio_read+0x33c/0x6a0
> Aug 1 01:16:17 coyote kernel: [ 7423.221830] [<c10b09c9>]
> do_sync_read+0xe9/0x140
> Aug 1 01:16:17 coyote kernel: [ 7423.221835] [<c104e400>] ?
> autoremove_wake_function+0x0/0x60
> Aug 1 01:16:17 coyote kernel: [ 7423.221839] [<c115211e>] ?
> security_file_permission+0x1e/0x40
> Aug 1 01:16:17 coyote kernel: [ 7423.221842] [<c10b0a80>] ?
> rw_verify_area+0x60/0xe0
> Aug 1 01:16:17 coyote kernel: [ 7423.221845] [<c10b1827>]
> vfs_read+0xb7/0x180
> Aug 1 01:16:17 coyote kernel: [ 7423.221848] [<c10b08e0>] ?
> do_sync_read+0x0/0x140
> Aug 1 01:16:17 coyote kernel: [ 7423.221850] [<c10b19f8>]
> sys_read+0x58/0xa0
> Aug 1 01:16:17 coyote kernel: [ 7423.221854] [<c10031b7>]
> sysenter_do_call+0x12/0x22
> Aug 1 01:16:17 coyote kernel: [ 7423.221856] Disabling lock debugging due to
> kernel taint
> Aug 1 01:20:31 coyote xinetd[2316]: EXIT: amanda status=0 pid=18959
> duration=330(sec)
> Aug 1 01:20:34 coyote xinetd[2316]: START: amanda pid=19523
> from=::ffff:192.168.71.3
>
> [snip extraneous stuff from my router that goes to syslog]
>
> Aug 1 02:45:05 coyote xinetd[2316]: EXIT: amanda status=0 pid=19523
> duration=5071(sec)
> Aug 1 02:45:07 coyote xinetd[2316]: START: amanda pid=21465
> from=::ffff:192.168.71.3
> Aug 1 02:46:07 coyote xinetd[2316]: EXIT: amanda status=0 pid=21465
> duration=60(sec)
> Aug 1 02:46:19 coyote xinetd[2316]: START: amanda pid=21545
> from=::ffff:192.168.71.3
> Aug 1 02:46:49 coyote xinetd[2316]: EXIT: amanda status=0 pid=21545
> duration=30(sec)
> Aug 1 02:46:50 coyote xinetd[2316]: START: amanda pid=21572
> from=::ffff:192.168.71.3
> Aug 1 02:47:20 coyote xinetd[2316]: EXIT: amanda status=0 pid=21572
> duration=30(sec)
> Aug 1 02:47:21 coyote xinetd[2316]: START: amanda pid=21604
> from=::ffff:192.168.71.3
> Aug 1 02:48:51 coyote xinetd[2316]: EXIT: amanda status=0 pid=21604
> duration=90(sec)
> [then this morning]
> Aug 2 01:36:10 coyote xinetd[2316]: EXIT: amanda status=0 pid=11552
> duration=540(sec)
> Aug 2 01:36:11 coyote xinetd[2316]: START: amanda pid=11801
> from=::ffff:192.168.71.3
> Aug 2 01:54:41 coyote xinetd[2316]: EXIT: amanda status=0 pid=11801
> duration=1110(sec)
> Aug 2 01:54:46 coyote xinetd[2316]: START: amanda pid=12247
> from=::ffff:192.168.71.3
> Aug 2 01:57:16 coyote xinetd[2316]: EXIT: amanda status=0 pid=12247
> duration=150(sec)
> Aug 2 01:57:30 coyote xinetd[2316]: START: amanda pid=12360
> from=::ffff:192.168.71.3
> Aug 2 01:58:00 coyote xinetd[2316]: EXIT: amanda status=0 pid=12360
> duration=30(sec)
> Aug 2 01:58:00 coyote xinetd[2316]: START: amanda pid=12387
> from=::ffff:192.168.71.3
> Aug 2 01:58:30 coyote xinetd[2316]: EXIT: amanda status=0 pid=12387
> duration=30(sec)
> Aug 2 01:58:30 coyote xinetd[2316]: START: amanda pid=12414
> from=::ffff:192.168.71.3
> Aug 2 01:59:30 coyote xinetd[2316]: EXIT: amanda status=0 pid=12414
> duration=60(sec)
> Aug 2 01:59:45 coyote xinetd[2316]: START: amanda pid=12482
> from=::ffff:192.168.71.3
> Aug 2 02:00:15 coyote xinetd[2316]: EXIT: amanda status=0 pid=12482
> duration=30(sec)
> Aug 2 02:00:30 coyote xinetd[2316]: START: amanda pid=12535
> from=::ffff:192.168.71.3
> Aug 2 02:01:00 coyote xinetd[2316]: EXIT: amanda status=0 pid=12535
> duration=30(sec)
>
> So it did not repeat the Ooops this morning. I wish I had more clues to
> offer.
> Let it ride as is for another 4 or 5 days, during which time I may have a
> good
> excuse to reboot & restart this -rc5 from scratch. Or maybe someone else
> might also
> note it, but if not watching the log, I think its otherwise silent & may go
> by
> un-noticed.
I had to reboot last evening, to 2.6.31-rc5 again as yumex updated some stuff and took me down eventually. The Oopsen would appear to be an initial run of tar per boot only thing as I had it again, 78 seconds after amanda sends tar off for a level 0 estimate run. From messages: Aug 4 01:15:02 coyote xinetd[2246]: START: amanda pid=5157 from=::ffff:192.168.71.3 Aug 4 01:16:20 coyote kernel: [ 2243.610865] BUG: Bad page state in process tar pfn:a1193 Aug 4 01:16:20 coyote kernel: [ 2243.610870] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0 Aug 4 01:16:20 coyote kernel: [ 2243.610875] Pid: 5390, comm: tar Not tainted 2.6.31-rc5 #2 Aug 4 01:16:20 coyote kernel: [ 2243.610877] Call Trace: Aug 4 01:16:20 coyote kernel: [ 2243.610884] [<c130cc83>] ? printk+0x23/0x40 Aug 4 01:16:20 coyote kernel: [ 2243.610890] [<c1083d6f>] bad_page+0xcf/0x150 Aug 4 01:16:20 coyote kernel: [ 2243.610893] [<c1084e0d>] get_page_from_freelist+0x37d/0x480 Aug 4 01:16:20 coyote kernel: [ 2243.610898] [<c107ed54>] ? add_to_page_cache_lru+0x84/0x90 Aug 4 01:16:20 coyote kernel: [ 2243.610901] [<c10850ef>] __alloc_pages_nodemask+0xdf/0x520 Aug 4 01:16:20 coyote kernel: [ 2243.610905] [<c1087954>] __do_page_cache_readahead+0x104/0x220 Aug 4 01:16:20 coyote kernel: [ 2243.610908] [<c1087aa4>] ra_submit+0x34/0x50 Aug 4 01:16:20 coyote kernel: [ 2243.610911] [<c1087d00>] ondemand_readahead+0x120/0x240 Aug 4 01:16:20 coyote kernel: [ 2243.610914] [<c1087ebc>] page_cache_async_readahead+0x9c/0xb0 Aug 4 01:16:20 coyote kernel: [ 2243.610917] [<c10806ac>] generic_file_aio_read+0x33c/0x6a0 Aug 4 01:16:20 coyote kernel: [ 2243.610940] [<c10b09c9>] do_sync_read+0xe9/0x140 Aug 4 01:16:20 coyote kernel: [ 2243.610944] [<c104e400>] ? autoremove_wake_function+0x0/0x60 Aug 4 01:16:20 coyote kernel: [ 2243.610948] [<c115211e>] ? security_file_permission+0x1e/0x40 Aug 4 01:16:20 coyote kernel: [ 2243.610951] [<c10b0a80>] ? rw_verify_area+0x60/0xe0 Aug 4 01:16:20 coyote kernel: [ 2243.610954] [<c10b1827>] vfs_read+0xb7/0x180 Aug 4 01:16:20 coyote kernel: [ 2243.610957] [<c10b08e0>] ? do_sync_read+0x0/0x140 Aug 4 01:16:20 coyote kernel: [ 2243.610960] [<c10b19f8>] sys_read+0x58/0xa0 Aug 4 01:16:20 coyote kernel: [ 2243.610963] [<c10031b7>] sysenter_do_call+0x12/0x22 Aug 4 01:16:20 coyote kernel: [ 2243.610965] Disabling lock debugging due to kernel taint Aug 4 01:22:02 coyote xinetd[2246]: EXIT: amanda status=0 pid=5157 duration=420(sec) Aug 4 01:22:05 coyote xinetd[2246]: START: amanda pid=5762 from=::ffff:192.168.71.3 Aug 4 02:49:36 coyote xinetd[2246]: EXIT: amanda status=0 pid=5762 duration=5251(sec) Aug 4 02:49:49 coyote xinetd[2246]: START: amanda pid=8101 from=::ffff:192.168.71.3 Aug 4 02:50:19 coyote xinetd[2246]: EXIT: amanda status=0 pid=8101 duration=30(sec) Aug 4 02:50:19 coyote xinetd[2246]: START: amanda pid=8142 from=::ffff:192.168.71.3 Aug 4 02:55:49 coyote xinetd[2246]: EXIT: amanda status=0 pid=8142 duration=330(sec) Aug 4 02:55:57 coyote xinetd[2246]: START: amanda pid=23419 from=::ffff:192.168.71.3 Aug 4 02:56:27 coyote xinetd[2246]: EXIT: amanda status=0 pid=23419 duration=30(sec) It may be that after this one, the lock disable prevents reporting subsequent instances? Neither htop, nor slabtop have any interesting showings this morning. Thanks. In reviewing the data above, I should note that this is a quad core phenom, running in 32 bit mode, and the kernel is built with that option in a make xconfig. I believe that is the i686/k8 option? With full pre-emption enabled. On Monday 10 August 2009, Gene Heskett wrote:
> On Sunday 09 August 2009, Rafael J. Wysocki wrote:
> >This message has been generated automatically as a part of a report
> >of recent regressions.
> >
> >The following bug entry is on the current list of known regressions
> >from 2.6.30. Please verify if it still should be listed and let me know
> >(either way).
> >
> >
> >Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13899
> >Subject : Oops from tar, 2.6.31-rc5, 32 bit on quad core
> phenom.
> >Submitter : Gene Heskett <gene.heskett@verizon.net>
> >Date : 2009-08-01 13:04 (9 days old)
> >References : http://marc.info/?l=linux-kernel&m=124913190304149&w=4
>
> Yes, it is still doing it to rc5 but only on the first invocation after the
> bootup. Strangely, it apparently has no other effect, either on the machine,
> or how amanda itself works.
On Thursday 20 August 2009, Gene Heskett wrote:
> On Wednesday 19 August 2009, Rafael J. Wysocki wrote:
> >This message has been generated automatically as a part of a report
> >of recent regressions.
> >
> >The following bug entry is on the current list of known regressions
> >from 2.6.30. Please verify if it still should be listed and let me know
> >(either way).
> >
> >
> >Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13899
> >Subject : Oops from tar, 2.6.31-rc5, 32 bit on quad core
> phenom.
> >Submitter : Gene Heskett <gene.heskett@verizon.net>
> >Date : 2009-08-01 13:04 (19 days old)
> >References : http://marc.info/?l=linux-kernel&m=124913190304149&w=4
>
> I have not seen this since I rebooted to 31-rc6 3.5 days ago. So apparently
> something that went into rc6 that was not in rc5, has fixed it. I have had
> one kmail crash, but IIRC that was after yum had updated something. I
> believe this particular tar related item has been attended to. Close it
> out.
|