Most recent kernel where this bug did not occur:
Distribution: Fedora Core 4
Hardware Environment: Athlon 64, SATA HDD, but it also occurs with only ATA drives
Software Environment:
Problem Description: While one tunnel is in use (pinging through it), trying to use another one makes the kernel oops:

Jan 22 20:23:17 router kernel: Unable to handle kernel NULL pointer dereference at 000000000000009c RIP:
Jan 22 20:23:17 router kernel: <ffffffff80345195>{xfrm4_output_finish+293}
Jan 22 20:23:17 router kernel: PGD 35377067 PUD 0
Jan 22 20:23:17 router kernel: Oops: 0000 [6]
Jan 22 20:23:17 router kernel: CPU 0
Jan 22 20:23:17 router kernel: Modules linked in: af_key deflate zlib_deflate twofish serpent blowfish sha256 crypto_null ae$
Jan 22 20:23:17 router kernel: Pid: 9481, comm: ping Not tainted 2.6.16-rc1 #2
Jan 22 20:23:17 router kernel: RIP: 0010:[<ffffffff80345195>] <ffffffff80345195>{xfrm4_output_finish+293}
Jan 22 20:23:17 router kernel: RSP: 0018:ffff810030c19a78 EFLAGS: 00010297
Jan 22 20:23:17 router kernel: RAX: 0000000000000000 RBX: ffff810032242c3c RCX: 0000000000000001
Jan 22 20:23:17 router kernel: RDX: 0000000000000001 RSI: 0000000000000449 RDI: ffff81003cba4d80
Jan 22 20:23:17 router kernel: RBP: ffff81003cba4d80 R08: 0000000000001d4b R09: ffffffff804ac460
Jan 22 20:23:17 router kernel: R10: 000000000000001a R11: ffff810032242878 R12: 0000000000000000
Jan 22 20:23:17 router kernel: R13: 0000000000000040 R14: 0000000000000040 R15: ffff810038dff9c0
Jan 22 20:23:17 router kernel: FS: 00002aeb6a96fd00(0000) GS:ffffffff804ba000(0000) knlGS:0000000000000000
Jan 22 20:23:17 router kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jan 22 20:23:17 router kernel: CR2: 000000000000009c CR3: 0000000038565000 CR4: 00000000000006e0
Jan 22 20:23:17 router kernel: Process ping (pid: 9481, threadinfo ffff810030c18000, task ffff81003ba90180)
Jan 22 20:23:17 router kernel: Stack: ffffffff80345070 0000000080000000 00000002804ac450 ffffffff80307864
Jan 22 20:23:17 router kernel:        ffff81003cba4d80 ffff810032242c3c ffff81003ed74a40 ffff81003868f0c0
Jan 22 20:23:17 router kernel:        0000000000000040 0000000000000040
Jan 22 20:23:17 router kernel: Call Trace: <ffffffff80345070>{xfrm4_output_finish+0}
Jan 22 20:23:17 router kernel: <ffffffff80307864>{nf_hook_slow+100} <ffffffff803454b4>{xfrm4_output+84}
Jan 22 20:23:17 router kernel: <ffffffff8031289f>{ip_push_pending_frames+895} <ffffffff8032cf95>{raw_sendmsg+1685}
Jan 22 20:23:17 router kernel: <ffffffff803484e9>{xfrm_state_get_afinfo+73} <ffffffff8034a16c>{xfrm_state_find+2588}
Jan 22 20:23:17 router kernel: <ffffffff802e28fa>{sock_sendmsg+266} <ffffffff80127e3d>{try_to_wake_up+301}
Jan 22 20:23:17 router kernel: <ffffffff80140a10>{autoremove_wake_function+0} <ffffffff80357f99>{_read_unlock_irq+9}
Jan 22 20:23:17 router kernel: <ffffffff80151f46>{filemap_nopage+390} <ffffffff802e1935>{move_addr_to_kernel+37}
Jan 22 20:23:17 router kernel: <ffffffff802e2f9a>{sys_sendmsg+586} <ffffffff80357ea9>{_spin_lock_irqsave+9}
Jan 22 20:23:17 router kernel: <ffffffff80140bb9>{remove_wait_queue+25} <ffffffff80357ea9>{_spin_lock_irqsave+9}
Jan 22 20:23:17 router kernel: <ffffffff80203a81>{__up_read+33} <ffffffff80359583>{do_page_fault+1139}
Jan 22 20:23:17 router kernel: <ffffffff802e4f29>{release_sock+25} <ffffffff80357f59>{_spin_unlock_irq+9}
Jan 22 20:23:17 router kernel: <ffffffff8010ab12>{system_call+126}
Jan 22 20:23:17 router kernel:
Jan 22 20:23:17 router kernel: Code: 41 80 bc 24 9c 00 00 00 00 74 57 48 8d 4d 58 48 8b 75 38 0f
Jan 22 20:23:17 router kernel: RIP <ffffffff80345195>{xfrm4_output_finish+293} RSP <ffff810030c19a78>
Jan 22 20:23:17 router kernel: CR2: 000000000000009c
Jan 22 20:24:42 router rmmod: ERROR: Module xfrm4_tunnel is in use

Steps to reproduce: With openswan running, use two tunnels together. The problem occurs on both gateways, but separately.
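A note on reading the oops above: CR2 holds the faulting address 0x9c, R12 is 0, and the first bytes of the Code: line (41 80 bc 24 9c 00 00 00 00) appear to decode to `cmpb $0x0,0x9c(%r12)`. That would mean a NULL pointer held in R12 was read at offset 0x9c. The Python sketch below is not part of the original report; it only re-checks that arithmetic against four lines copied from the log, and the helper name `reg` is made up for illustration.

```python
import re

# Four lines copied from the oops in this report (syslog prefix removed).
OOPS = """\
Unable to handle kernel NULL pointer dereference at 000000000000009c RIP:
R10: 000000000000001a R11: ffff810032242878 R12: 0000000000000000
CR2: 000000000000009c CR3: 0000000038565000 CR4: 00000000000006e0
Code: 41 80 bc 24 9c 00 00 00 00 74 57 48 8d 4d 58 48 8b 75 38 0f
"""

def reg(name: str) -> int:
    """Return the value of a register printed as 'NAME: <16 hex digits>'."""
    m = re.search(rf"\b{name}: ([0-9a-f]{{16}})", OOPS)
    return int(m.group(1), 16)

# Raw instruction bytes at the faulting RIP.
code = bytes.fromhex(re.search(r"Code: ([0-9a-f ]+)", OOPS).group(1))

rex, opcode, modrm, sib = code[0], code[1], code[2], code[3]
sub_opcode = (modrm >> 3) & 7                # 7 selects CMP for opcode 0x80
base = (sib & 7) | ((rex & 1) << 3)          # REX.B extends the SIB base field
disp = int.from_bytes(code[4:8], "little")   # 32-bit displacement

fault = reg("R12") + disp
print(f"opcode {opcode:#x} /{sub_opcode}, base r{base}, disp {disp:#x}")
print(f"computed fault address {fault:#x}, CR2 {reg('CR2'):#x}")
```

The computed address matching CR2 is consistent with a NULL structure pointer being dereferenced inside xfrm4_output_finish; which structure that is cannot be told from the log alone.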
Here is my lsmod output:

Module                   Size  Used by
af_key                  38804  0
deflate                  5120  1
zlib_deflate            22560  1 deflate
twofish                 45824  0
serpent                 20480  0
blowfish                 9728  0
sha256                   9728  0
crypto_null              3712  0
aes                     27688  11
des                     17792  0
xfrm4_tunnel             5384  7
ipcomp                   9232  4
esp4                     9984  11
ah4                      7808  0
ipt_LOG                  8320  3
xt_pkttype               2816  3
iptable_mangle           3968  1
xt_MARK                  3712  1
xt_limit                 3712  1
xt_state                 3072  3
xt_tcpudp                4480  8
ipt_MASQUERADE           4736  2
iptable_nat              9860  1
ip_nat                  21784  2 ipt_MASQUERADE,iptable_nat
ip_conntrack            62632  4 xt_state,ipt_MASQUERADE,iptable_nat,ip_nat
nfnetlink                8392  2 ip_nat,ip_conntrack
iptable_filter           4096  1
ip_tables               14560  3 iptable_mangle,iptable_nat,iptable_filter
x_tables                16136  9 ipt_LOG,xt_pkttype,xt_MARK,xt_limit,xt_state,xt_tcpudp,ipt_MASQUERADE,iptable_nat,ip_tables
ipv6                   288736  12
parport_pc              32236  1
lp                      16064  0
parport                 44172  2 parport_pc,lp
autofs4                 23944  2
dm_mod                  62280  0
video                   18568  0
button                   8352  0
battery                 11272  0
ac                       6280  0
ohci_hcd                23300  0
ehci_hcd                36360  0
i2c_nforce2              8960  0
i2c_core                25728  1 i2c_nforce2
shpchp                  50048  0
snd_intel8x0            37672  0
snd_ac97_codec         109884  1 snd_intel8x0
snd_ac97_bus             3584  1 snd_ac97_codec
snd_seq_dummy            4868  0
snd_seq_oss             36964  0
snd_seq_midi_event       9472  1 snd_seq_oss
snd_seq                 61208  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
snd_seq_device          11024  3 snd_seq_dummy,snd_seq_oss,snd_seq
snd_pcm_oss             58400  0
snd_mixer_oss           19968  1 snd_pcm_oss
snd_pcm                103176  3 snd_intel8x0,snd_ac97_codec,snd_pcm_oss
snd_timer               28296  2 snd_seq,snd_pcm
snd                     67168  9 snd_intel8x0,snd_ac97_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
soundcore               12192  1 snd
snd_page_alloc          12816  2 snd_intel8x0,snd_pcm
r8169                   33672  0
3c59x                   51124  0
mii                      7168  1 3c59x
floppy                  74264  0
ext3                   140944  7
jbd                     64296  1 ext3
raid1                   24192  7
sata_nv                 11268  16
libata                  65944  1 sata_nv
sd_mod                  19456  18
scsi_mod               156248  2 libata,sd_mod
I think the problem is in the iptables rules. I'm now checking every single rule; when I find the one that causes the kernel oops, I'll post back.
I think the problem lies in MASQUERADE. Suppose a net-to-net tunnel from 10.0.1.0/255.255.255.0 (GW1) to 10.0.0.0/255.255.255.0 (GW2), each LAN connected to eth1 of its own gateway, and the internet gateway connected to eth0 of both gateways.

iptables -t nat -A POSTROUTING -s 10.0.0.1 -o eth0 -j MASQUERADE
ping -I eth1 10.0.1.1
Result: kernel oops

iptables -t nat -A POSTROUTING -s 10.0.0.1 -d ! 10.0.1.0/255.255.255.0 -j MASQUERADE
ping -I eth1 10.0.1.1
Result: ok

I hope this helps. Could this be an issue with the NAT-T patch on 2.6.16-rc1? I've installed openswan 2.4.4 from the rpm for Fedora Core 4.
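To make the difference between the two rules above concrete, here is a small Python sketch (not from the original report) that models only their -s and -d matches with the standard `ipaddress` module. It assumes the ping leaves with source 10.0.0.1, ignores the `-o eth0` match, and uses made-up function names and a documentation address (192.0.2.7) as a stand-in for an internet host.

```python
import ipaddress

REMOTE_LAN = ipaddress.ip_network("10.0.1.0/255.255.255.0")
SRC = ipaddress.ip_address("10.0.0.1")

def rule1_matches(src, dst):
    # Models: -s 10.0.0.1 -o eth0 -j MASQUERADE (interface match not modelled)
    return src == SRC

def rule2_matches(src, dst):
    # Models: -s 10.0.0.1 -d ! 10.0.1.0/255.255.255.0 -j MASQUERADE
    return src == SRC and dst not in REMOTE_LAN

ping_dst = ipaddress.ip_address("10.0.1.1")       # host behind the tunnel
internet_dst = ipaddress.ip_address("192.0.2.7")  # assumed example address

for name, rule in (("rule 1", rule1_matches), ("rule 2", rule2_matches)):
    print(name,
          "| masquerades tunnel ping:", rule(SRC, ping_dst),
          "| masquerades internet traffic:", rule(SRC, internet_dst))
```

Under these assumptions, only the first rule rewrites the source of the packet headed for 10.0.1.1, which fits the observation that the first rule triggers the oops and the second does not.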
Thanks for your reply. I'm new to this, so I don't fully understand it. Do you think there is a way to get two VPNs up and running with masquerading while avoiding this crash?

Environment description:

GW1 (openswan + firewall, kernel 2.6.16-rc1 on AMD 64)
10.0.0.0/24 on eth1
public IP on eth0 (internet connection)

GW2 (openswan + firewall, kernel 2.6.16-rc1 on AMD 64)
10.0.1.0/24 on eth1
public IP on eth0 (internet connection)

On both gateways:
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

I need this rule because the router on eth0 only lets masqueraded traffic through (Cisco ACL).

I could not use
iptables -t nat -A POSTROUTING -o eth0 -d ! REMOTE_LAN_TROUGH_VPN -j MASQUERADE
because I think that without masquerading my router on eth0 will drop the ESP packets.

Thanks.
Any news on this bug? Thanks.
Reassigning this to kaber; he's the netfilter/IPsec guy.
The problem happens when a packet which matches a policy is SNATed and doesn't match any policy afterwards. I'll probably get a fix done by tonight.
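The situation described here can be pictured with a toy model. The Python sketch below is not kernel code; it assumes a single tunnel policy between the two LANs from this report, a made-up public address (203.0.113.1) for the masqueraded source, and illustrative names such as `lookup` and `snat`. It only shows that a packet can match a policy before source NAT and match none afterwards.

```python
import ipaddress
from typing import NamedTuple, Optional

class Policy(NamedTuple):
    src: ipaddress.IPv4Network
    dst: ipaddress.IPv4Network

class Packet(NamedTuple):
    src: ipaddress.IPv4Address
    dst: ipaddress.IPv4Address

# One tunnel policy: local LAN to remote LAN (addresses from this report).
POLICIES = [
    Policy(ipaddress.ip_network("10.0.0.0/24"),
           ipaddress.ip_network("10.0.1.0/24")),
]

def lookup(pkt: Packet) -> Optional[Policy]:
    """Return the first policy whose selectors match the packet, else None."""
    for policy in POLICIES:
        if pkt.src in policy.src and pkt.dst in policy.dst:
            return policy
    return None

def snat(pkt: Packet, new_src: str) -> Packet:
    """Rewrite the source address, as MASQUERADE would on the way out."""
    return pkt._replace(src=ipaddress.ip_address(new_src))

before = Packet(ipaddress.ip_address("10.0.0.1"),
                ipaddress.ip_address("10.0.1.1"))
after = snat(before, "203.0.113.1")  # assumed public address

print("policy before SNAT:", lookup(before))
print("policy after SNAT: ", lookup(after))
```

In this model the second lookup finds nothing, so any code that assumes a policy is still attached to the packet after NAT has to handle that case explicitly.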
Any news on this? Thank you.
Any news on this bug? I really need it fixed, please help.
Created attachment 7350 [details] [NETFILTER]: Fix xfrm lookup after SNAT
Created attachment 7351 [details] [XFRM]: Fix SNAT-related crash in xfrm4_output_finish
Sorry for the delay. These two patches should fix the problem.
No problem, thanks for your help. Bye, and really good work!
The patches from this bug are already included in kernel 2.6.16.