Bug 5936 - Openswan tunnels + netfilter problem
Summary: Openswan tunnels + netfilter problem
Status: CLOSED CODE_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: Netfilter/Iptables
Hardware: i386 Linux
Importance: P2 high
Assignee: Patrick McHardy
Reported: 2006-01-22 12:02 UTC by Domenico
Modified: 2006-04-22 13:06 UTC
Kernel Version: 2.6.16-rc1


Attachments
[NETFILTER]: Fix xfrm lookup after SNAT (3.96 KB, patch)
2006-02-15 10:27 UTC, Patrick McHardy
[XFRM]: Fix SNAT-related crash in xfrm4_output_finish (7.42 KB, patch)
2006-02-15 10:27 UTC, Patrick McHardy

Description Domenico 2006-01-22 12:02:18 UTC
Most recent kernel where this bug did not occur:
Distribution: Fedora Core 4
Hardware Environment: Athlon 64, SATA HDD but occurs also with only ATA Drives
Software Environment:
Problem Description: While one tunnel is in use (pinging through it), trying
to use another tunnel makes the kernel oops.

Jan 22 20:23:17 router kernel: Unable to handle kernel NULL pointer 
dereference at 000000000000009c RIP:
Jan 22 20:23:17 router kernel: <ffffffff80345195>{xfrm4_output_finish+293}
Jan 22 20:23:17 router kernel: PGD 35377067 PUD 0
Jan 22 20:23:17 router kernel: Oops: 0000 [6]
Jan 22 20:23:17 router kernel: CPU 0
Jan 22 20:23:17 router kernel: Modules linked in: af_key deflate 
zlib_deflate twofish serpent blowfish sha256 crypto_null ae$
Jan 22 20:23:17 router kernel: Pid: 9481, comm: ping Not tainted 2.6.16-rc1 
#2
Jan 22 20:23:17 router kernel: RIP: 0010:[<ffffffff80345195>] 
<ffffffff80345195>{xfrm4_output_finish+293}
Jan 22 20:23:17 router kernel: RSP: 0018:ffff810030c19a78  EFLAGS: 00010297
Jan 22 20:23:17 router kernel: RAX: 0000000000000000 RBX: ffff810032242c3c 
RCX: 0000000000000001
Jan 22 20:23:17 router kernel: RDX: 0000000000000001 RSI: 0000000000000449 
RDI: ffff81003cba4d80
Jan 22 20:23:17 router kernel: RBP: ffff81003cba4d80 R08: 0000000000001d4b 
R09: ffffffff804ac460
Jan 22 20:23:17 router kernel: R10: 000000000000001a R11: ffff810032242878 
R12: 0000000000000000
Jan 22 20:23:17 router kernel: R13: 0000000000000040 R14: 0000000000000040 
R15: ffff810038dff9c0
Jan 22 20:23:17 router kernel: FS:  00002aeb6a96fd00(0000) 
GS:ffffffff804ba000(0000) knlGS:0000000000000000
Jan 22 20:23:17 router kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
000000008005003b
Jan 22 20:23:17 router kernel: CR2: 000000000000009c CR3: 0000000038565000 
CR4: 00000000000006e0
Jan 22 20:23:17 router kernel: Process ping (pid: 9481, threadinfo 
ffff810030c18000, task ffff81003ba90180)
Jan 22 20:23:17 router kernel: Stack: ffffffff80345070 0000000080000000 
00000002804ac450 ffffffff80307864
Jan 22 20:23:17 router kernel:        ffff81003cba4d80 ffff810032242c3c 
ffff81003ed74a40 ffff81003868f0c0
Jan 22 20:23:17 router kernel:        0000000000000040 0000000000000040
Jan 22 20:23:17 router kernel: Call Trace: 
<ffffffff80345070>{xfrm4_output_finish+0}
Jan 22 20:23:17 router kernel:        <ffffffff80307864>{nf_hook_slow+100} 
<ffffffff803454b4>{xfrm4_output+84}
Jan 22 20:23:17 router kernel: 
<ffffffff8031289f>{ip_push_pending_frames+895} 
<ffffffff8032cf95>{raw_sendmsg+1685}
Jan 22 20:23:17 router kernel: 
<ffffffff803484e9>{xfrm_state_get_afinfo+73} 
<ffffffff8034a16c>{xfrm_state_find+2588}
Jan 22 20:23:17 router kernel:        <ffffffff802e28fa>{sock_sendmsg+266} 
<ffffffff80127e3d>{try_to_wake_up+301}
Jan 22 20:23:17 router kernel: 
<ffffffff80140a10>{autoremove_wake_function+0} 
<ffffffff80357f99>{_read_unlock_irq+9}
Jan 22 20:23:17 router kernel:        <ffffffff80151f46>{filemap_nopage+390} 
<ffffffff802e1935>{move_addr_to_kernel+37}
Jan 22 20:23:17 router kernel:        <ffffffff802e2f9a>{sys_sendmsg+586} 
<ffffffff80357ea9>{_spin_lock_irqsave+9}
Jan 22 20:23:17 router kernel: 
<ffffffff80140bb9>{remove_wait_queue+25} 
<ffffffff80357ea9>{_spin_lock_irqsave+9}
Jan 22 20:23:17 router kernel:        <ffffffff80203a81>{__up_read+33} 
<ffffffff80359583>{do_page_fault+1139}
Jan 22 20:23:17 router kernel:        <ffffffff802e4f29>{release_sock+25} 
<ffffffff80357f59>{_spin_unlock_irq+9}
Jan 22 20:23:17 router kernel:        <ffffffff8010ab12>{system_call+126}
Jan 22 20:23:17 router kernel:
Jan 22 20:23:17 router kernel: Code: 41 80 bc 24 9c 00 00 00 00 74 57 48 8d 
4d 58 48 8b 75 38 0f
Jan 22 20:23:17 router kernel: RIP 
<ffffffff80345195>{xfrm4_output_finish+293} RSP <ffff810030c19a78>
Jan 22 20:23:17 router kernel: CR2: 000000000000009c
Jan 22 20:24:42 router rmmod: ERROR: Module xfrm4_tunnel is in use


Steps to reproduce: With Openswan running, use two tunnels together. The
problem occurs on both gateways, but separately.
Comment 1 Domenico 2006-01-22 14:59:59 UTC
Here is my lsmod output:

Module                  Size  Used by
af_key                 38804  0
deflate                 5120  1
zlib_deflate           22560  1 deflate
twofish                45824  0
serpent                20480  0
blowfish                9728  0
sha256                  9728  0
crypto_null             3712  0
aes                    27688  11
des                    17792  0
xfrm4_tunnel            5384  7
ipcomp                  9232  4
esp4                    9984  11
ah4                     7808  0
ipt_LOG                 8320  3
xt_pkttype              2816  3
iptable_mangle          3968  1
xt_MARK                 3712  1
xt_limit                3712  1
xt_state                3072  3
xt_tcpudp               4480  8
ipt_MASQUERADE          4736  2
iptable_nat             9860  1
ip_nat                 21784  2 ipt_MASQUERADE,iptable_nat
ip_conntrack           62632  4 xt_state,ipt_MASQUERADE,iptable_nat,ip_nat
nfnetlink               8392  2 ip_nat,ip_conntrack
iptable_filter          4096  1
ip_tables              14560  3 iptable_mangle,iptable_nat,iptable_filter
x_tables               16136  9 ipt_LOG,xt_pkttype,xt_MARK,xt_limit,xt_state,xt_tcpudp,ipt_MASQUERADE,iptable_nat,ip_tables
ipv6                  288736  12
parport_pc             32236  1
lp                     16064  0
parport                44172  2 parport_pc,lp
autofs4                23944  2
dm_mod                 62280  0
video                  18568  0
button                  8352  0
battery                11272  0
ac                      6280  0
ohci_hcd               23300  0
ehci_hcd               36360  0
i2c_nforce2             8960  0
i2c_core               25728  1 i2c_nforce2
shpchp                 50048  0
snd_intel8x0           37672  0
snd_ac97_codec        109884  1 snd_intel8x0
snd_ac97_bus            3584  1 snd_ac97_codec
snd_seq_dummy           4868  0
snd_seq_oss            36964  0
snd_seq_midi_event      9472  1 snd_seq_oss
snd_seq                61208  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
snd_seq_device         11024  3 snd_seq_dummy,snd_seq_oss,snd_seq
snd_pcm_oss            58400  0
snd_mixer_oss          19968  1 snd_pcm_oss
snd_pcm               103176  3 snd_intel8x0,snd_ac97_codec,snd_pcm_oss
snd_timer              28296  2 snd_seq,snd_pcm
snd                    67168  9 snd_intel8x0,snd_ac97_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
soundcore              12192  1 snd
snd_page_alloc         12816  2 snd_intel8x0,snd_pcm
r8169                  33672  0
3c59x                  51124  0
mii                     7168  1 3c59x
floppy                 74264  0
ext3                  140944  7
jbd                    64296  1 ext3
raid1                  24192  7
sata_nv                11268  16
libata                 65944  1 sata_nv
sd_mod                 19456  18
scsi_mod              156248  2 libata,sd_mod
Comment 2 Domenico 2006-01-22 15:10:24 UTC
I think the problem is in my iptables rules. I am now checking every single
rule; when I find the one that causes the kernel oops, I'll post back.
Comment 3 Domenico 2006-01-22 15:35:07 UTC
I think the problem lies in MASQUERADE.

Assume a net-to-net tunnel from 10.0.1.0/255.255.255.0 (GW1) to
10.0.0.0/255.255.255.0 (GW2). Each LAN is connected to eth1 of its own
gateway, and the Internet gateway is connected to eth0 of both gateways.

With this rule:

iptables -t nat -A POSTROUTING -s 10.0.0.1 -o eth0 -j MASQUERADE

ping -I eth1 10.0.1.1 causes a kernel oops.

With this rule instead:

iptables -t nat -A POSTROUTING -s 10.0.0.1 -d ! 10.0.1.0/255.255.255.0 -j MASQUERADE

ping -I eth1 10.0.1.1 works.
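
The difference between the two rules can be illustrated with a small Python sketch. This is only a toy model of the source and destination matching, not of iptables itself: the function name masquerade_matches is made up for this example, the -o eth0 match of the first rule is assumed to succeed, and the ping is assumed to use 10.0.0.1 as its source address.

```python
from ipaddress import ip_address, ip_network


def masquerade_matches(src, dst, rule_src, excluded_dst=None):
    """Toy model: would a packet (src, dst) be hit by the MASQUERADE rule?

    rule_src     -- the rule's -s argument
    excluded_dst -- the rule's "-d !" argument, or None if the rule has none
    The outgoing-interface match (-o eth0) is assumed to succeed.
    """
    if ip_address(src) not in ip_network(rule_src):
        return False
    if excluded_dst is not None and ip_address(dst) in ip_network(excluded_dst):
        return False
    return True


# Packet generated by "ping -I eth1 10.0.1.1" on the gateway, assuming
# (hypothetically) that it carries 10.0.0.1 as its source address.
first_rule = masquerade_matches("10.0.0.1", "10.0.1.1", "10.0.0.1")
second_rule = masquerade_matches(
    "10.0.0.1", "10.0.1.1", "10.0.0.1", "10.0.1.0/255.255.255.0"
)
print(first_rule, second_rule)
```

In this model the first rule also rewrites the source of packets bound for the remote LAN, while the second leaves them alone, which is consistent with the oops appearing only with the first rule.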

I hope this helps. Could this be a NAT-T patch issue on 2.6.16-rc1? I
installed Openswan 2.4.4 from the RPM for Fedora Core 4.

Comment 4 Domenico 2006-01-24 15:26:35 UTC
Thank you for your reply.

I'm new to this, so I don't understand it clearly.

Do you think there is a way to keep two VPNs up and running with
masquerading enabled while avoiding this crash?

Environment description:

GW1 (openswan + firewall  kernel 2.6.16-rc1 on AMD 64)

10.0.0.0/24 on eth1
public ip on eth0 (internet connection)


GW2 (openswan + firewall  kernel 2.6.16-rc1 on AMD 64)

10.0.1.0/24 on eth1
public ip on eth0 (internet connection)

On both gateways: iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

I need this rule because traffic has to be masqueraded before the router on
eth0 will let it through (Cisco ACL).

I could not use
iptables -t nat -A POSTROUTING -o eth0 -d ! REMOTE_LAN_TROUGH_VPN -j MASQUERADE
because I think that without masquerading my router on eth0 will drop the ESP
packets.

Thank you.
Comment 5 Domenico 2006-01-27 16:33:46 UTC
Any news on this bug?

Thank you.

Comment 6 Harald Welte 2006-01-31 02:17:10 UTC
reassigning this to kaber, he's the netfilter/ipsec guy.
Comment 7 Patrick McHardy 2006-01-31 05:26:59 UTC
The problem happens when a packet which matches a policy is SNATed and doesn't
match any policy afterwards. I'll probably get a fix done by tonight.
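
To illustrate the condition described above, here is a toy Python model. It is not the kernel's xfrm code: the policy table, the function names, and the public address 192.0.2.1 (a documentation address standing in for the gateway's eth0 IP) are all assumptions made for this sketch.

```python
from ipaddress import ip_address, ip_network

# Hypothetical IPsec policy table: (source subnet, destination subnet)
# pairs whose traffic must go through the tunnel.
POLICIES = [(ip_network("10.0.0.0/24"), ip_network("10.0.1.0/24"))]


def lookup_policy(src, dst):
    """Return the first policy matching the packet, or None."""
    for src_net, dst_net in POLICIES:
        if ip_address(src) in src_net and ip_address(dst) in dst_net:
            return (src_net, dst_net)
    return None


def snat(src, new_src):
    """Toy SNAT: replace the packet's source address."""
    return new_src


# Lookup on the original packet: it matches the tunnel policy.
before = lookup_policy("10.0.0.1", "10.0.1.1")

# Lookup after SNAT to the (assumed) public address: nothing matches.
after = lookup_policy(snat("10.0.0.1", "192.0.2.1"), "10.0.1.1")

print(before is not None, after is None)
```

In this model the packet is selected for IPsec on the first lookup, but once SNAT has rewritten its source address a second lookup finds no policy at all.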
Comment 8 Domenico 2006-02-02 11:37:19 UTC
Any news on this?
Thank you.
Comment 9 Domenico 2006-02-10 11:17:29 UTC
Any news on this bug?

I really need it fixed, please help.

Comment 10 Patrick McHardy 2006-02-15 10:27:23 UTC
Created attachment 7350
[NETFILTER]: Fix xfrm lookup after SNAT
Comment 11 Patrick McHardy 2006-02-15 10:27:50 UTC
Created attachment 7351
[XFRM]: Fix SNAT-related crash in xfrm4_output_finish
Comment 12 Patrick McHardy 2006-02-15 10:28:23 UTC
Sorry for the delay. These two patches should fix the problem.
Comment 13 Domenico 2006-02-18 15:30:10 UTC
No problem, thank you for your help.

Bye, and really good work!

Comment 14 Adrian Bunk 2006-04-22 13:06:17 UTC
The patches from this bug are already included in kernel 2.6.16.
