Hello,

I have two sites interconnected using IPsec (libreswan). The situation is as follows:

X <=> (a) <=> (Internet) <=> (b) <=> Y

There are two gateways, a and b, connected to the internet, with their corresponding internal subnets X and Y. Gateway a is connected to the provider p using PPPoE. The IPsec tunnel is created between a and b to interconnect subnets X and Y. The problem appears when gateway b, using its internal address y, communicates with gateway a using its internal address x. Addresses x and y are defined by leftsourceip and rightsourceip in the libreswan configuration. You get this behavior:

b# ping -M do x -s 1392 -c 1
PING x (x.x.x.x) 1392(1420) bytes of data.

--- ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

b# ping -M do a -s 1460 -c 3
PING a (a.a.a.a) 1460(1488) bytes of data.
From p (p.p.p.p) icmp_seq=1 Frag needed and DF set (mtu = 1480)
ping: local error: message too long, mtu=1480
ping: local error: message too long, mtu=1480

--- ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2014ms

b# ping -M do x -s 1392 -c 3
PING x (x.x.x.x) 1392(1420) bytes of data.
ping: local error: message too long, mtu=1418
ping: local error: message too long, mtu=1418
ping: local error: message too long, mtu=1418

--- ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2046ms

Legend:
x.x.x.x is the inner IP address of gateway (a) (or x from the inside).
a.a.a.a is the outer address of gateway (a).
p.p.p.p is some address in the provider's network on the (a) side.

So the IPsec tunnel evidently becomes aware of the MTU only when some outer communication is in progress. The inner communication by itself does not react to the ICMP packets used for PMTU discovery. I also had a situation where even the outer pings did not make IPsec aware of the MTU; after a reboot it started to behave as described again.
Did I describe it understandably or should I clarify things? Thanks Marek
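Note on the sizes in the ping output above: the figure in parentheses is the -s payload plus the 8-byte ICMP echo header and the 20-byte IPv4 header (assuming no IP options). A minimal shell sketch of that arithmetic follows; the function names are hypothetical and only illustrate the calculation.

```shell
#!/bin/sh
# IPv4 header without options (20 bytes) plus ICMP echo header (8 bytes).
ICMP_OVERHEAD=28

# ping_total_len PAYLOAD: total IPv4 packet length for a `ping -s PAYLOAD`.
ping_total_len() {
    echo $(( $1 + ICMP_OVERHEAD ))
}

# max_ping_payload MTU: largest `ping -s` value that fits unfragmented.
max_ping_payload() {
    echo $(( $1 - ICMP_OVERHEAD ))
}

ping_total_len 1392     # 1420, matching "1392(1420) bytes of data"
max_ping_payload 1418   # 1390, so -s 1392 is 2 bytes over the learned tunnel MTU
```

With the learned tunnel MTU of 1418, the 1420-byte test packet is just too large, which is why it is a convenient probe for this bug.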
Hi Marek!

Could you please provide routing information for both cases using

ip route get x.x.x.x
ip route get a.a.a.a

and

ip route list

It would be best to get this information before and after PMTU is active.

Thanks
Hi Vadim,

do you mean that ip route get x.x.x.x and ip route get a.a.a.a should be run on gateway (a) or (b)? I suspect (b).

Should the ip route list contain all interfaces, or should I filter out the relevant ones? The list could be pretty large on gateway (a). Side (b) is pretty simple.

I hope I will be able to get the "before" data on side (b), because the ssh connection usually freezes until the outer ping is run.

There is a change on side (b). The new provider does not use MTU 1500. An IPv6 tunnel is used to carry the IPv4 traffic, and the final MTU is even less than on side (a) with PPPoE. The behavior is the same as it was before with MTU 1500 on side (b). I can see the ICMP unreachable in tcpdump, but it is ignored until communication outside of IPsec is present. To be more specific, the IPsec is NAT-T over UDP port 4500.

Thanks

Marek
(In reply to marek.gresko from comment #2)
> Hi Vadim,
>
> you mean to run the ip route get x.x.x.x and ip route get a.a.a.a to be run
> on gateway (a) or (b)? I suspect (b).

Yes, it's all about router (b), where you observe the problem.

> The ip route list should contain all interfaces or should I filter out the
> relevant ones? The list could be pretty large on gateway (a). The side (b)
> is pretty simple.

Again, router (b); you may filter out the relevant routes only.

> There is a change on side (b). The new provider does not use MTU 1500. There
> is some ipv6 tunnel used to tunnel ipv4 traffic and the final MTU is even
> less than on side (a) using pppoe. The behavior is the same as it was before
> with MTU 1500 on side (b). I can see the ICMP unreachable in tcpdump, but it
> is ignored until communication outside of ipsec is present. The ipsec is
> nat-t udp on port 4500 to be more specific.

It's OK to have changes. I would like to see the network configuration with all layers of encapsulation. A pcap file could also help a lot.
Hello,

before MTU learning:

ip route get x.x.x.x
x.x.x.x via b.b.b.d dev enp2s0 src y.y.y.y uid 0
    cache

ip route get a.a.a.a
a.a.a.a via b.b.b.d dev enp2s0 src b.b.b.b uid 0
    cache

ip route list
default via b.b.b.d dev enp2s0 proto dhcp metric 100
b.b.b.0/24 dev enp2s0 proto kernel scope link src b.b.b.b metric 100
x.x.x.0/24 via b.b.b.d dev enp2s0 src y.y.y.y

after MTU learning:

ip route get x.x.x.x
x.x.x.x via b.b.b.d dev enp2s0 src y.y.y.y uid 0
    cache

ip route get a.a.a.a
a.a.a.a via b.b.b.d dev enp2s0 src b.b.b.b uid 0
    cache expires 590sec mtu 1444

ip route list
default via b.b.b.d dev enp2s0 proto dhcp metric 100
b.b.b.0/24 dev enp2s0 proto kernel scope link src b.b.b.b metric 100
x.x.x.0/24 via b.b.b.d dev enp2s0 src y.y.y.y

Marek
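Note: the only difference between the two dumps above is the cached exception on the a.a.a.a route (expires 590sec mtu 1444); the x.x.x.x route shows no learned MTU in either dump. A minimal sketch for checking this from a script is given here. The helper name route_pmtu is hypothetical, and it only parses the text format shown in this comment.

```shell
#!/bin/sh
# route_pmtu: read `ip route get` output on stdin and print the value that
# follows the first "mtu" keyword. Prints nothing if no exception is cached.
route_pmtu() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mtu") { print $(i + 1); exit } }'
}

# Usage on gateway (b), with the placeholder addresses from this report:
#   ip route get a.a.a.a | route_pmtu
#   ip route get x.x.x.x | route_pmtu
```

An empty result for the outer address would correspond to the "before MTU learning" state.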
tcpdump -nnvv -i enp2s0 icmp
dropped privs to tcpdump
tcpdump: listening on enp2s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
20:21:34.748792 IP (tos 0xc0, ttl 64, id 59803, offset 0, flags [none], proto ICMP (1), length 576)
    b.b.b.d > b.b.b.b: ICMP a.a.a.a unreachable - need to frag (mtu 1444), length 556
        IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 1448)
    b.b.b.b.4500 > a.a.a.a.4500: [no cksum] UDP-encap: ESP(spi=0xaaaaaaaa,seq=0xaaa), length 1420
^C
1 packet captured
1 packet received by filter
0 packets dropped by kernel

Strange it is in reverse order....

Marek
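Note on the numbers in this capture: the rejected outer packet is 1448 bytes, of which 20 bytes are the IPv4 header (assuming no options) and 8 bytes are the UDP header used for NAT-T, leaving the 1420 bytes of ESP that tcpdump reports. The sketch below only restates that arithmetic; the function names are hypothetical, and the inner tunnel MTU is not derived here because it also depends on the ESP cipher and padding.

```shell
#!/bin/sh
# IPv4 header without options (20 bytes) plus UDP header (8 bytes) for NAT-T.
NATT_OVERHEAD=28

# esp_len OUTER_LEN: bytes of ESP carried in an outer packet of OUTER_LEN bytes.
esp_len() {
    echo $(( $1 - NATT_OVERHEAD ))
}

# excess OUTER_LEN PMTU: bytes by which the packet exceeds PMTU (0 if it fits).
excess() {
    if [ "$1" -gt "$2" ]; then
        echo $(( $1 - $2 ))
    else
        echo 0
    fi
}

esp_len 1448        # 1420, the ESP length shown by tcpdump
excess 1448 1444    # 4 bytes over the advertised path MTU
esp_len 1444        # 1416 bytes of ESP would fit within the path MTU
```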
I can confirm a regression in the current stable kernel. I suggest downgrading to the latest LTS (5.10.x), as it is not affected.
Hello,

I can live with the bug, since a workaround using a constant ping is available. Is there an ETA for the version in which the problem will be fixed?

Thanks

Marek
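Note: the thread does not say exactly how the constant ping is run. One possible reading, based on the first comment, is to periodically send an oversized DF ping to the outer address of gateway (a) so that the outer route keeps a learned PMTU. The sketch below assumes that reading; the helper names, the 1460-byte size, and the 300-second pause are illustrative assumptions, not values confirmed in this report.

```shell
#!/bin/sh
# Hypothetical helper: re-trigger path MTU learning on gateway (b) by sending
# an oversized DF ping to the outer address of gateway (a).
# PING_CMD can be overridden (for example with "echo") for a dry run.
PING_CMD=${PING_CMD:-ping}

# pmtu_probe HOST SIZE: send one probe; a failing ping is expected and ignored.
pmtu_probe() {
    $PING_CMD -M do -s "$2" -c 1 -q "$1" || true
}

# pmtu_keepalive HOST SIZE ROUNDS PAUSE: probe ROUNDS times, PAUSE seconds apart.
pmtu_keepalive() {
    i=0
    while [ "$i" -lt "$3" ]; do
        pmtu_probe "$1" "$2"
        i=$(( i + 1 ))
        sleep "$4"
    done
}

# Example with assumed values (probe every 5 minutes, 12 times):
#   pmtu_keepalive a.a.a.a 1460 12 300
```

The pause is chosen to be shorter than the "expires 590sec" lifetime seen on the cached route, on the assumption that the entry has to be relearned after it expires.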
The fix is committed to the -net branch and will go to stable versions later.
Could you please post here when the merge occurs and specify the version?

Thanks

Marek
The fix was merged into v5.13.6 and will not be merged into 5.12.x because that series is EOL.
Hello,

I confirm that kernel 5.13.6-200.fc34 fixes the problem. At first I did not notice, because I pinged with -c 1 and the first response did not arrive.

Thanks

Marek
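Note: a small sketch for checking whether a kernel release string is at or above the fixed version named above. The function names are hypothetical, and `sort -V` is a GNU coreutils extension. The check is only meaningful from the 5.13 series onward: 5.10.x is reported as unaffected in this thread, the status of other older series is not stated, and later series are assumed to carry the fix.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted version A is >= B (relies on `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# has_pmtu_fix RELEASE: strip a distro suffix such as "-200.fc34", then
# compare against 5.13.6, the version the fix was merged into.
has_pmtu_fix() {
    version_ge "${1%%-*}" 5.13.6
}

# Usage:
#   has_pmtu_fix "$(uname -r)" && echo "kernel is 5.13.6 or newer"
```

When testing a fixed kernel, send more than one packet (for example -c 3), since the first oversized packet can still be lost while the MTU is being learned.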