Bug 42645 - kernel BUG at fs/ocfs2/file.c:756
Status: ASSIGNED
Alias: None
Product: File System
Classification: Unclassified
Component: ocfs2
Hardware: All Linux
Importance: P1 normal
Assignee: fs_ocfs2
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-01-23 23:43 UTC by Sonni
Modified: 2013-12-23 14:25 UTC
CC List: 2 users

See Also:
Kernel Version: 3.1.6
Subsystem:
Regression: No
Bisected commit-id:



Description Sonni 2012-01-23 23:43:22 UTC
Setup:
The server provides an iSCSI device with OCFS2 to itself and to other hosts, and exports the locally mounted OCFS2 filesystem via NFS.

disks -> md mirror -> lvm -> iscsi target (tgt) -> iscsi via loopback -> ocfs2 mount -> nfs export

Sometimes, when my diskless NFS client (its NFS root is on reiserfs) uses some of the OCFS2 partitions via NFS, the error below occurs.

The NFS client is a PVR, and at the time it was writing (recording) to the following file:
4295946240 23 jan 18:42 4500_20120123173900.mpg

The file is on a partition with size=793 GB and free=646 GB.

Dmesg:

kernel BUG at fs/ocfs2/file.c:756!
invalid opcode: 0000 [#1] SMP 
Modules linked in: nfnetlink_queue af_packet cls_fw cls_u32 sch_tbf sch_prio sch_htb sch_hfsc sch_sfq xt_time xt_
connlimit xt_realm iptable_raw xt_comment xt_recent xt_policy ipt_ULOG ipt_REJECT ipt_REDIRECT ipt_NETMAP ipt_MAS
QUERADE ipt_ECN ipt_ecn ipt_ah nf_nat_tftp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_
ftp nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntr
ack_netlink nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp xt_tcpmss xt_pkttype xt_physdev xt_owner xt_NFQUE
UE xt_NFLOG nfnetlink_log xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit xt_DSC
P xt_dscp xt_dccp xt_conntrack xt_connmark xt_CLASSIFY ipt_LOG xt_tcpudp xt_state iptable_nat nf_nat nf_conntrack
_ipv4 nf_defrag_ipv4 nf_conntrack nfnetlink iptable_filter macvlan veth ocfs2_stack_o2cb nfsd lockd auth_rpcgss e
xportfs sunrpc smsc47m192 hwmon_vid smsc47m1 coretemp iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge
 ipv6 stp llc ocfs2_dlmfs ocfs2_dlm ocfs2 jbd2 ocfs2_nodemanager ocfs2_stackglue quota_tree dlm configfs crc32c t
un iptable_mangle ip_tables x_tables dummy loop usb_storage evdev ub processor 3c59x i2c_i801 pcspkr serio_raw i2
c_core uhci_hcd intel_agp intel_gtt ehci_hcd sg button floppy thermal_sys agpgart unix

Pid: 2696, comm: nfsd Not tainted 3.0.6-gentoo #4                  /D945GCLF2
EIP: 0060:[<f851e015>] EFLAGS: 00010297 CPU: 0
EIP is at ocfs2_zero_extend+0x4e5/0xc80 [ocfs2]
EAX: 00000001 EBX: 00000000 ECX: 000f0000 EDX: 00000001
ESI: 000f0000 EDI: f08ac9e0 EBP: f08ac90c ESP: f0c45a50
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process nfsd (pid: 2696, ti=f0c44000 task=f5280040 task.ti=f0c44000)
Stack:
 f0c45ad0 f0c45acc ca4a2738 f1683896 ca4a2750 00000000 f91d9cce ca4a2738
 f91d84f0 00010013 000ef000 00000001 000ef000 00000001 0001000e 000f0000
 00000000 f95a40cf 2b99528e 00000001 c1007459 000f0000 000ef000 00000000
Call Trace:
 [<f91d9cce>] ? br_dev_queue_push_xmit+0x4e/0x70 [bridge]
 [<f91d84f0>] ? br_dev_xmit+0x100/0x160 [bridge]
 [<f95a40cf>] ? tcp_packet+0x8ff/0xfb0 [nf_conntrack]
 [<c1007459>] ? nommu_map_page+0x39/0x70
 [<c124ab5f>] ? rtl8169_start_xmit+0xcf/0x5e0
 [<f85017e9>] ? ocfs2_write_begin_nolock+0x4b9/0x2410 [ocfs2]
 [<f85526cf>] ? ocfs2_metadata_cache_owner+0xf/0x20 [ocfs2]
 [<f85526ad>] ? ocfs2_metadata_cache_unlock+0xd/0x20 [ocfs2]
 [<f85528c5>] ? ocfs2_buffer_cached.clone.15+0xa5/0x100 [ocfs2]
 [<c1343e50>] ? __mutex_lock_slowpath+0x170/0x210
 [<f8504ca9>] ? ocfs2_read_blocks+0x259/0x4d0 [ocfs2]
 [<f8523d1e>] ? ocfs2_read_inode_block_full+0x3e/0x60 [ocfs2]
 [<f8521700>] ? ocfs2_find_actor+0xe0/0xe0 [ocfs2]
 [<f85157c5>] ? ocfs2_inode_lock_full_nested+0x4f5/0xb60 [ocfs2]
 [<f8527f6c>] ? ocfs2_wait_for_recovery+0x1c/0xa0 [ocfs2]
 [<f8503829>] ? ocfs2_write_begin+0xe9/0x210 [ocfs2]
 [<c107d6c3>] ? generic_file_buffered_write+0xe3/0x210
 [<f852124a>] ? ocfs2_file_aio_write+0x74a/0x7a0 [ocfs2]
 [<f851b1e0>] ? ocfs2_file_aio_read+0x2d0/0x2d0 [ocfs2]
 [<f92de67a>] ? exportfs_decode_fh+0x8a/0x1f8 [exportfs]
 [<f8520b00>] ? ocfs2_file_splice_write+0x2a0/0x2a0 [ocfs2]
 [<c10aeff6>] ? do_sync_readv_writev+0xb6/0xf0
 [<c10ae73a>] ? rw_verify_area+0x6a/0x130
 [<c10af23a>] ? do_readv_writev+0xaa/0x1a0
 [<f8520b00>] ? ocfs2_file_splice_write+0x2a0/0x2a0 [ocfs2]
 [<c118373b>] ? snprintf+0x1b/0x20
 [<f8510699>] ? ocfs2_build_lock_name+0x49/0xb0 [ocfs2]
 [<f850fc8b>] ? ocfs2_add_lockres_tracking+0x2b/0xb0 [ocfs2]
 [<c10af380>] ? vfs_writev+0x50/0x60
 [<f936ce62>] ? nfsd_vfs_write.clone.9+0x92/0x2f0 [nfsd]
 [<c10ace9f>] ? dentry_open+0x3f/0x80
 [<f851b500>] ? ocfs2_dir_open+0x10/0x10 [ocfs2]
 [<f936db15>] ? nfsd_open+0xe5/0x190 [nfsd]
 [<f936f121>] ? nfsd_write+0x111/0x130 [nfsd]
 [<f9374bf1>] ? nfsd3_proc_write+0xb1/0x150 [nfsd]
 [<f9375fce>] ? nfs3svc_decode_writeargs+0xde/0x140 [nfsd]
 [<f9369831>] ? nfsd_dispatch+0xd1/0x200 [nfsd]
 [<f92b00f3>] ? svc_authenticate+0xa3/0xc0 [sunrpc]
 [<f92aca67>] ? svc_process+0x417/0x760 [sunrpc]
 [<f93690dc>] ? nfsd+0xac/0x140 [nfsd]
 [<c1024660>] ? complete+0x40/0x60
 [<f9369030>] ? nfsd_shutdown+0x30/0x30 [nfsd]
 [<c104abc4>] ? kthread+0x74/0x80
 [<c104ab50>] ? kthread_worker_fn+0x140/0x140
 [<c13459b6>] ? kernel_thread_helper+0x6/0xd
Code: 00 00 83 d3 00 39 df 89 74 24 3c 72 0e 77 04 39 ce 76 08 89 4c 24 3c 89 5c 24 40 8b 44 24 2c 39 44 24 40 8b
 bd d0 00 00 00 73 03 <0f> 0b 90 77 0a 8b 54 24 28 39 54 24 3c 76 f1 8b 4c 24 2c 31 db 
EIP: [<f851e015>] ocfs2_zero_extend+0x4e5/0xc80 [ocfs2] SS:ESP 0068:f0c45a50
---[ end trace ffcb6b14f9600ed7 ]---

syslog:
Jan 23 18:42:14 xeno a2738 f1683896 ca4a2750 00000000 f91d9cce ca4a2738
Jan 23 18:42:14 xeno kernel: f91d84f0 00010013 000ef000 00000001 000ef000 00000001 0001000e 000f0000
Jan 23 18:42:14 xeno kernel: 00000000 f95a40cf 2b99528e 00000001 c10075 0f0000f0 0000<>alTae
Jan 23 18:42:14 xeno 4 <9dce]? br_e_uu_uhxi+xe07 big]<>[f18f>  rdvxi+x0/x6 big]<>[f54c>  c_akt08f0f0[fcntak
Jan 23 18:42:14 xeno 4 <1049]?nmumppg+x907
Jan 23 18:42:14 xeno 4 <14bf]?rl19satx+xf050<>[f51e>  cs_rt_ei_ook049021 of2
Jan 23 18:42:14 xeno 4 <856f]?of2mtdt_ah_we+x/0[cs]<>[f52a>  cs_eaaacceulc+x/x0[cs]<>[f52c>  cs_ufrcce.ln.50a/x0 
of2
Jan 23 18:42:14 xeno 4 <14e0]?_mtxlc_lwah010020<>[f54a>  cs_edbok+x5/xd of2
Jan 23 18:42:14 xeno 4 <82de]?of2ra_nd_lc_ul03/x0[cs]<>[f510>  cs_idatr0e/x0[cs]<>[f55c>  cs_nd_okfl_etd0450b0[cs
]<>[f576>  cs_atfrrcvr+xc0a of2
Jan 23 18:42:14 xeno 4 <8089]?of2wiebgn0e/x1 of2
Jan 23 18:42:14 xeno 4 <1763]?gnrcfl_ufrdwie0e/x1
Jan 23 18:42:14 xeno 4 <822a]?of2fl_i_rt+x4/xa of2
Jan 23 18:42:14 xeno 4 <8110]?of2fl_i_ed020020[cs]<>[f2e7>  xotsdcd_h08/xf eprf]<>[f500>  cs_ieslc_rt+xa/xa of2
Jan 23 18:42:14 xeno 4 <1af6]?d_ycravwie+x60f
Jan 23 18:42:14 xeno 4 <1a7a]?r_eiyae+xa010<>[c0f3>  oravwie+xa010<>[f500>  cs_ieslc_rt+xa/xa of2
Jan 23 18:42:14 xeno 4 <187b]?spit+xb02
Jan 23 18:42:14 xeno 4 <8169]?of2bidlc_ae04/x0[cs]<>[f5f8>  cs_d_oke_rcig02/x0[cs]<>[c0f8>  f_rtv05/x0<>[f3c6>  f
dvswiecoe909/xf ns]<>[c0c9>  etyoe+xf08
Jan 23 18:42:14 xeno 4 <8150]?of2droe+x001 of2
Jan 23 18:42:14 xeno 4 <96b5]?ns_pn0e/x9 ns]<>[f3f2>  fdwie011010[fd
Jan 23 18:42:14 xeno 4 <97b1]?ns3po_rt+x1010[fd
Jan 23 18:42:14 xeno 4 <97fe]?nsscdcd_rtag+xe010[fd
Jan 23 18:42:14 xeno 4 <9681]?ns_ipth0d/x0 ns]<>[f20f>  v_uhniae0a/x0[urc
Jan 23 18:42:14 xeno 4 <9aa7]?scpoes047070[urc
Jan 23 18:42:14 xeno 4 <960c]?ns+xc010[fd
Jan 23 18:42:14 xeno 4 <1260]?cmlt+x006
Jan 23 18:42:14 xeno 4 <9600]?ns_hton03/x0[fd
Jan 23 18:42:14 xeno 4 <14b4]?khed07/x0<>[c0a5>  tra_okrf+x4/x4
Jan 23 18:42:14 xeno 4 <1496]?kre_hedhle+x/x
Jan 23 18:42:14 xeno 0Cd:0 08 30 9d 97 43 20 70 9c 60 94 43 95 44 b4 42 94 44 bb 00 00 30 0>0 07 a8 42 83 42 c7 1
8 c2 c3 b
Jan 23 18:42:14 xeno 0EP <8105]of2zr_xed0450c0[cs]S:S 08fc55
Jan 23 18:42:14 xeno 4-- n rc fbb490e7]-


Kernel version:
Linux xeno 3.0.6-gentoo #4 SMP Sun Nov 20 10:30:33 CET 2011 i686 Intel(R) Atom(TM) CPU 330 @ 1.60GHz GenuineIntel GNU/Linux
 
Gnu C                  4.5.3
Gnu make               3.82
binutils               2.21.1
util-linux             2.19.1
mount                  support
module-init-tools      3.16
e2fsprogs              1.41.14
reiserfsprogs          3.6.21
xfsprogs               3.1.4
Linux C Library        2.13
Dynamic linker (ldd)   2.13
Procps                 3.2.8
Net-tools              1.60_p20110409135728
Kbd                    1.15.3wip
Sh-utils               8.7
Modules Loaded         nfnetlink_queue af_packet cls_fw cls_u32 sch_tbf sch_prio sch_htb sch_hfsc sch_sfq xt_time xt_connlimit xt_realm iptable_raw xt_comment xt_recent xt_policy ipt_ULOG ipt_REJECT ipt_REDIRECT ipt_NETMAP ipt_MASQUERADE ipt_ECN ipt_ecn ipt_ah nf_nat_tftp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp xt_tcpmss xt_pkttype xt_physdev xt_owner xt_NFQUEUE xt_NFLOG nfnetlink_log xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit xt_DSCP xt_dscp xt_dccp xt_conntrack xt_connmark xt_CLASSIFY ipt_LOG xt_tcpudp xt_state iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack nfnetlink iptable_filter macvlan veth ocfs2_stack_o2cb nfsd lockd auth_rpcgss exportfs sunrpc smsc47m192 hwmon_vid smsc47m1 coretemp iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge ipv6 stp llc ocfs2_dlmfs ocfs2_dlm ocfs2 jbd2 ocfs2_nodemanager ocfs2_stackglue quota_tree dlm configfs crc32c tun iptable_mangle ip_tables x_tables dummy loop usb_storage evdev ub processor 3c59x i2c_i801 pcspkr serio_raw i2c_core uhci_hcd intel_agp intel_gtt ehci_hcd sg button floppy thermal_sys agpgart unix


It looks like this is related to the size of the file. Going back through the logs, I found that a write to the following file also failed, with the trace below.

4295299072  1 jan 22:27 1005_20120101211300.mpg
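
Both file sizes reported here sit just above 4 GiB (2^32 bytes). The sketch below splits each size into its high and low 32-bit words, which is how a 64-bit value appears as two adjacent words in a 32-bit (i686) stack dump. The names `split_words` and `FOUR_GIB` are made up for this illustration. The result is consistent with a size-related trigger, but it does not establish the root cause.

```python
# Split a 64-bit byte count into its high and low 32-bit words.
# FOUR_GIB and split_words are illustrative names, not kernel symbols.
FOUR_GIB = 1 << 32  # 4294967296 bytes

def split_words(size):
    """Return (high, low) 32-bit halves of a 64-bit byte count."""
    return size >> 32, size & 0xFFFFFFFF

# File sizes exactly as listed in this report.
sizes = {
    "4500_20120123173900.mpg": 4295946240,
    "1005_20120101211300.mpg": 4295299072,
}

for name, size in sizes.items():
    high, low = split_words(size)
    past = size - FOUR_GIB
    print(f"{name}: {past} bytes past 4 GiB, high={high:08x} low={low:08x}")
```

For the 23 Jan file this gives high=00000001 and low=000ef000, and the pair "000ef000 00000001" does appear in the Stack: section of the dmesg output above. The 1 Jan file gives high=00000001 and low=00051000.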

Jan  1 22:27:13 xeno >------ u ee]------<>enlBGa sof2fl.:5!<>nai poe 00[1 M <>oue ikdi:nntikqeea_aktcsf l_3 c_b c
_roshhbshhs c_f ttm tcnlmtx_el pal_a tcmetx_eetx_oiyitUO p_EETitRDRC p_EMPitMSUREpE tci_ _tf _tinn_tnn_o_efai _t3
 _ttncna_tncna_pfotcptspfotcppfotcptg _nrkei _nrkrncna_2ncna_ptcsxptethd _n _QUxNOnei_gtuirxmktaxlixlg _rgxhp _si
ttS _cxdptotcxcnrxCSFi_Gtcdxstiaeann _nrkp _fgp _nrkftniaeiemvne f_a_cndodu_csxrsupscm2wniscmcemist bc_pisic_apts
irgi6tl f_m f_mc2b f_dag f_ageuar moisr2t tlmg _bs_bsuyo bta  d ch osreoai_039iea t_tga oy2ceckstrlyeicbt i<
Jan  1 22:27:13 xeno >d30cmndotnd.6eo#      D5L
Jan  1 22:27:13 xeno >P00<40>EA:007P <E   f_ree+4/c c2<E:001B 00 X050E:0014S 00 Ie26E:60cS ea
Jan  1 22:27:13 xeno >S0bS0bS08S00S080re s(d30te30tk190a.=e0)0tk< ea ea 4d 78 4d 00 1c 4d
Jan  1 22:27:13 xeno >9800060100010100010000204000000ce9000e83050050000<Clre< fbc] _vueu_i0ex rg
Jan  1 22:27:13 xeno ><14>?rextx0x0bd]4[170 ifi_tt0505< fee] f_i_g_lkx9x1[f]4[90b nn_tfe0bx fa
Jan  1 22:27:13 xeno ><56>?c2edaaewrx00os
Jan  1 22:27:13 xeno ><56>?c2edaaenc0/2[f]4[825 osbf_cdle5x/1 c2< c45] melklphx0x04[849 osrdlk050d[f]4[83e osrdne
lku+300os
Jan  1 22:27:13 xeno ><57>?c2i_t+e00os
Jan  1 22:27:13 xeno ><47>?c2neo_lnt+4/b c2< f06] f_if_ce+100os
Jan  1 22:27:13 xeno ><48>?c2reenx/2 c2< c7c] nifeuedrex/2
Jan  1 22:27:13 xeno ><52>?c2i_orexax0os
Jan  1 22:27:13 xeno ><41>?c2i_oe+2/2 c2< fb7] pt_cehx/1 xrs< f00] f_lsi_i+2/2 c2< caf] _nrdwt+b004[1ea rvi_e0ax0
4[1fa drdwt+a0a< f00] f_lsi_i+2/2 c2< c83] pn+1004[809 osbllka+400os
Jan  1 22:27:13 xeno ><4c>?c2dlksrkgx/b[f]4[1f0 v_ivx/6< f46] sv_i.o.02x0nd< c57] h_o_ux/1
Jan  1 22:27:13 xeno ><0e>?eronx/8< ff0] f_rp+100os
Jan  1 22:27:13 xeno ><3b>?f_e05x0nd< f42] swt0103[s
Jan  1 22:27:13 xeno ><3b>?f3r_i+b05[s
Jan  1 22:27:13 xeno ><3f>?fs_cereg0ex0nd< f43] sdpc01x0nd< f9f] cueit03x up
Jan  1 22:27:13 xeno ><2a>?vpcsx7x0sr]4[99c ndx/1 f]4[140 cpt00x
Jan  1 22:27:13 xeno ><30>?f_uo+300nd< c4c] ha04x
Jan  1 22:27:13 xeno ><0b>?te_rrnx0x04[156 kn_rdee0/d0o:003309f944c2e749e689c4c9c40b44c9440bd0000330             
        0I ff1]c2e_tdx5x0os :P0:ea
Jan  1 22:27:55 xeno >-e a 4bc17]-<6>Shorewall:fw2net:ACCEPT:IN= OUT=eth1 SRC=81.161.186.199 DST=208.83.137.115 L
EN=60 TOS=0x00 PREC=0x00 TTL=64 ID=61611 DF PROTO=TCP SPT=42427 DPT=2703 WINDOW=14600 RES=0x00 SYN URGP=0 
Jan  1 22:27:56 xeno kernel: hrwl:wntACP:N U=t1SC8.6.8.9 S=0.3161 E=0TS00 RC00 T=4I=23 FPOOT P=19 P=73WNO=40 E=x0
SNUG= <>Soealf2e:CETI=OTeh R=1111619DT288.315LN6 O=x0PE=00 T=4I=7 FPOOTPST449DT20 IDW160RS00 Y RP0


I will try to reproduce the issue, and will upgrade the kernel to the stable 3.1.6 on Gentoo to see whether that makes any difference.

Alternatively, if someone knows that this issue has been resolved in a later version, I am willing to try that.
Comment 1 Alan 2012-08-30 14:05:30 UTC
Is this still seen, or did your 3.1.6 update fix it?
Comment 2 Sonni 2012-08-30 14:25:57 UTC
My upgrade to 3.1.6 did not resolve this. It has been a while now, but I seem to recall similar or other issues with 3.1.6.

I am currently running 3.0.35 and have upgraded my former NFS clients to use OCFS2.

I can give it a try in the coming week to see whether it still fails with a large NFS write.
Comment 3 Sonni 2012-09-15 16:27:19 UTC
I have tried, but am unable to reproduce this issue on kernel 3.0.15.
