Bug 35862
Summary: | arp requests from wrong src IP | ||
---|---|---|---|
Product: | Networking | Reporter: | Victor Mataré (matare) |
Component: | IPV4 | Assignee: | Stephen Hemminger (stephen) |
Status: | RESOLVED OBSOLETE | ||
Severity: | normal | CC: | akpm, alan |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | Subsystem: | ||
Regression: | No | Bisected commit-id: |
Description
Victor Mataré
2011-05-25 23:27:46 UTC
Sorry, forgot the kernel version. The host above runs a gentoo 2.6.36-hardened-r9 kernel, while the other one (which is not shown but exhibits the same behaviour) has 2.6.29-gentoo-r5.

(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Wed, 25 May 2011 23:27:48 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=35862
>
> Summary: arp requests from wrong src IP
> Product: Networking
> Version: 2.5
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: IPV4
> AssignedTo: shemminger@linux-foundation.org
> ReportedBy: matare@lih.rwth-aachen.de
> Regression: No
>
>
> I switched a host's ip address from 137.226.164.13 to 137.226.164.2. The .13 IP
> now belongs to the host that had .2 before (I swapped them). Now both hosts
> still arp from their old IPs although ifconfig as well as ip clearly tell
> otherwise. Examining the host which now has 137.226.164.13:
>
> # ip addr show dev eth0
> 4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
> link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
> inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
> inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0
>
> but arping defaults to the old src IP (.13). I can manually correct this with
> the -s parameter, but it looks like linux still believes that 137.226.164.13 is
> this host's ip address.
> When I try to manually correct the arp table:
> # arp -s 137.226.164.13 00:30:48:70:91:95
> SIOCSARP: Invalid argument
> # arp -n 137.226.164.13
> 137.226.164.13 (137.226.164.13) -- no entry
>
> And this is what arping does:
> # tcpdump -ieth0 -c1 -s0 -vvv -n arp & (sleep 1; arping 137.226.164.13 &> /dev/null)
> [1] 2217
> tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
> 01:14:37.785126 arp who-has 137.226.164.13 (ff:ff:ff:ff:ff:ff) tell 137.226.164.13
>
> Also, ifconfig doesn't even show the second IP address:
> # ifconfig eth0
> eth0 Link encap:Ethernet HWaddr 00:e0:81:41:1f:e4
> inet addr:137.226.164.2 Bcast:137.226.164.255 Mask:255.255.255.0
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:103996345 errors:0 dropped:0 overruns:0 frame:0
> TX packets:122352625 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:52478932087 (48.8 GiB) TX bytes:110248931949 (102.6 GiB)
> Interrupt:24
>
> What's going on here? If this is by design, it's very unintuitive behaviour.
>

From: Andrew Morton <akpm@linux-foundation.org>
Date: Wed, 25 May 2011 16:31:37 -0700

>> I switched a host's ip address from 137.226.164.13 to 137.226.164.2. The .13 IP
>> now belongs to the host that had .2 before (I swapped them). Now both hosts
>> still arp from their old IPs although ifconfig as well as ip clearly tell
>> otherwise. Examining the host which now has 137.226.164.13:
>>
>> # ip addr show dev eth0
>> 4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
>> link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
>> inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
>> inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0

If you keep the old IP address around it remains as the "primary"
IP address.

You have to explicitly remove the original IP address from the
interface first, then add the new one, in order for the new
one to become the "primary"

Not a bug, please close this.

For some reason my mail reply doesn't appear here, so I'll repeat:
On Thursday, 26.05.2011 03:52:22 David Miller wrote:
> From: Andrew Morton <akpm@linux-foundation.org>
> Date: Wed, 25 May 2011 16:31:37 -0700
>
> >> I switched a host's ip address from 137.226.164.13 to 137.226.164.2. The .13 IP
> >> now belongs to the host that had .2 before (I swapped them). Now both hosts
> >> still arp from their old IPs although ifconfig as well as ip clearly tell
> >> otherwise. Examining the host which now has 137.226.164.13:
> >>
> >> # ip addr show dev eth0
> >> 4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
> >> link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
> >> inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
> >> inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0
>
> If you keep the old IP address around it remains as the "primary"
> IP address.
>
> You have to explicitly remove the original IP address from the
> interface first, then add the new one, in order for the new
> one to become the "primary"
>
> Not a bug, please close this.
>
Sorry, there's a typo. It's supposed to read:
[...]
Examining the host which now has 137.226.164.2 (used to have 137.226.164.13):
# ip addr show dev eth0
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0
[...]
Sorry, got confused with all the swapping. I'm not keeping the old address around, it's completely *gone*, from both ifconfig and ip. But still it's being used as arp src address. That's what this bug is about. Sorry for the confusion. Reopening, hoping the issue is clear now.
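For reference, the order of operations David Miller describes in the quoted reply (remove the original address first, then add the new one) might look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names `run` and `swap_primary` and the `DRY_RUN` switch are hypothetical, and the addresses are the ones from this report. By default it only prints the `ip` commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: replace the primary IPv4 address on an interface.
# DRY_RUN=1 (the default here) prints the commands; DRY_RUN=0 executes them (needs root).
DRY_RUN="${DRY_RUN:-1}"

# run CMD...  (hypothetical helper): print or execute a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# swap_primary DEV OLD_CIDR NEW_CIDR  (hypothetical helper)
swap_primary() {
    dev="$1"; old="$2"; new="$3"
    # Remove the old address first so it cannot stay on as the primary.
    run ip addr del "$old" dev "$dev"
    # "brd +" lets ip derive the broadcast address from the new prefix.
    run ip addr add "$new" brd + dev "$dev"
    # Show the resulting address list for inspection.
    run ip addr show dev "$dev"
}

# Example with the addresses from this report (dry run: prints three commands).
swap_primary eth0 137.226.164.13/24 137.226.164.2/24
```

Note that this covers only the primary/secondary ordering; it does not address the case reported here, where the old address is already gone from the interface.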
Your reply came through OK. bugzilla can be a bit slow at times. Emailed reply-to-all is the right thing to do, thanks.

Reply-To: ja@ssi.bg

Hello,

On Thu, 26 May 2011, Victor Mataré wrote:

> Examining the host which now has 137.226.164.2 (used to have 137.226.164.13):
>
> # ip addr show dev eth0
> 4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
> link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
> inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
> inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0
> [...]
>
> Sorry, got confused with all the swapping. I'm *not* keeping the old address
> around, it's completely *gone*, from both ifconfig and ip. But still it's
> being used as arp src address. That's what this bug is about. Sorry for the
> confusion.

It looks strange. Can you confirm the following things:

- the kernel version
- the order of 'ip' command used to add and change IPs on this box
- output of 'ip route list table local' after IPs are changed and before starting arping
- output of 'strace arping', I assume it is using getsockname after UDP connect
- any reason to use broadcast 137.226.164.255 for all addresses?

Regards

--
Julian Anastasov <ja@ssi.bg>

On Friday, 27.05.2011 07:27:23 Julian Anastasov wrote:
>
> Hello,
>
> On Thu, 26 May 2011, Victor Mataré wrote:
>
> > Examining the host which now has 137.226.164.2 (used to have 137.226.164.13):
> >
> > # ip addr show dev eth0
> > 4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
> > link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
> > inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
> > inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0
> > [...]
> >
> > Sorry, got confused with all the swapping. I'm *not* keeping the old
> > address around, it's completely *gone*, from both ifconfig and ip. But still
> > it's being used as arp src address. That's what this bug is about. Sorry for
> > the confusion.
>
> It looks strange. Can you confirm the following things:
>
> - the kernel version

This host runs 2.6.36-hardened-r9. I'm not sure which vanilla release that's based on, but it's patched with grsec and PAX. However another host which exhibits the exact same behaviour runs 2.6.29-gentoo-r5. This one does not have hardened or grsec, but gentoo patches, so I'd assume this is neither a version- nor a patch-specific problem.

>
> - the order of 'ip' command used to add and change IPs on this box

ok - starting situation was 2 IPs: 137.226.164.13/24 (eth0) and 192.168.23.13/24 (eth0:0)
then I did "ifconfig eth0 137.226.164.2 netmask 255.255.255.0"
I'm not exactly sure what happened then, but the result was that "ip addr show dev eth0" showed that eth0 still had the old IP address, while ifconfig didn't. Ifconfig was misbehaving in some kind of way, that's why I checked the situation with the ip tool. Then I used ip to configure everything as intended and now I have the situation described in this bug. Note that the server has been in productive use for a week now despite of that.

>
> - output of 'ip route list table local' after IPs are changed and
> before starting arping

broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
broadcast 192.168.23.0 dev eth0 proto kernel scope link src 192.168.23.2
local 192.168.23.2 dev eth0 proto kernel scope host src 192.168.23.2
local 137.226.164.2 dev eth0 proto kernel scope host src 137.226.164.2
local 137.226.164.13 dev eth0 proto kernel scope host src 137.226.164.13
broadcast 192.168.23.255 dev eth0 proto kernel scope link src 192.168.23.2
broadcast 137.226.164.255 dev eth0 proto kernel scope link src 137.226.164.2
broadcast 137.226.164.255 dev eth0 proto kernel scope link src 192.168.23.2
broadcast 127.0.0.0 dev lo proto kernel scope link src 127.0.0.1
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1

I guess that entry "local 137.226.164.13" shouldn't be there? But shouldn't that be removed automatically when I delete the IP from eth0?

>
> - output of 'strace arping', I assume it is using getsockname
> after UDP connect

# strace arping 137.226.164.13
[...]
socket(PF_PACKET, SOCK_DGRAM, 0) = 3
setuid(0) = 0
ioctl(3, SIOCGIFINDEX, {ifr_name="eth0", ifr_index=4}) = 0
ioctl(3, SIOCGIFFLAGS, {ifr_name="eth0", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "eth0\0", 5) = 0
setsockopt(4, SOL_SOCKET, SO_DONTROUTE, [1], 4) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("137.226.164.13")}, 16) = 0
getsockname(4, {sa_family=AF_INET, sin_port=htons(44125), sin_addr=inet_addr("137.226.164.13")}, [16]) = 0
close(4) = 0
bind(3, {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST, addr(0)={0, }, 128) = 0
getsockname(3, {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST, addr(6)={1, 00e081411fe4}, [18]) = 0
[...]
no reply
[...]

compare that with:

# strace arping 137.226.164.3
[...]
socket(PF_PACKET, SOCK_DGRAM, 0) = 3
setuid(0) = 0
ioctl(3, SIOCGIFINDEX, {ifr_name="eth0", ifr_index=4}) = 0
ioctl(3, SIOCGIFFLAGS, {ifr_name="eth0", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "eth0\0", 5) = 0
setsockopt(4, SOL_SOCKET, SO_DONTROUTE, [1], 4) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("137.226.164.3")}, 16) = 0
getsockname(4, {sa_family=AF_INET, sin_port=htons(45467), sin_addr=inet_addr("137.226.164.2")}, [16]) = 0
close(4) = 0
bind(3, {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST, addr(0)={0, }, 128) = 0
getsockname(3, {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST, addr(6)={1, 00e081411fe4}, [18]) = 0
[...]
reply
[...]

So that's the change in source address, and I guess it's due to the table above? Then this is more like a bug in the "ip" utility?

>
> - any reason to use broadcast 137.226.164.255 for all addresses?

Nope, none at all. I didn't see that because I thought ifconfig and ip use sensible defaults. Well...

So thanks, looks like you're pointing in the right direction.

Victor

Reply-To: ja@ssi.bg

Hello,

On Sat, 28 May 2011, Victor Mataré wrote:

> ok - starting situation was 2 IPs: 137.226.164.13/24 (eth0) and
> 192.168.23.13/24 (eth0:0)
> then I did "ifconfig eth0 137.226.164.2 netmask 255.255.255.0"
> I'm not exactly sure what happened then, but the result was that "ip addr
> show dev eth0" showed that eth0 still had the old IP address, while ifconfig
> didn't. Ifconfig was misbehaving in some kind of way, that's why I checked
> the situation with the ip tool. Then I used ip to configure everything as
> intended and now I have the situation described in this bug. Note that the
> server has been in productive use for a week now despite of that.
>
> > - output of 'ip route list table local' after IPs are changed and
> > before starting arping
>
> broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
> broadcast 192.168.23.0 dev eth0 proto kernel scope link src 192.168.23.2
> local 192.168.23.2 dev eth0 proto kernel scope host src 192.168.23.2
> local 137.226.164.2 dev eth0 proto kernel scope host src 137.226.164.2
> local 137.226.164.13 dev eth0 proto kernel scope host src 137.226.164.13
> broadcast 192.168.23.255 dev eth0 proto kernel scope link src 192.168.23.2
> broadcast 137.226.164.255 dev eth0 proto kernel scope link src 137.226.164.2
> broadcast 137.226.164.255 dev eth0 proto kernel scope link src 192.168.23.2
> broadcast 127.0.0.0 dev lo proto kernel scope link src 127.0.0.1
> local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
> local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
>
> I guess that entry "local 137.226.164.13" shouldn't be there? But shouldn't
> that be removed automatically when I delete the IP from eth0?

Yes, this problem looks like what we fixed recently:

http://marc.info/?l=linux-netdev&m=129848300922970&w=2
http://marc.info/?l=linux-netdev&m=130048961407666&w=2
http://marc.info/?l=linux-netdev&m=130057251901164&w=2

It can happen only when you add 137.226.164.13 many times with different subnet mask at the same time, eg. /32 and /24. To understand what really happens for your setup we should try commands that reproduce the problem, eg. on some unused device such as eth1 or dummy0. The first link has such test script as example. Leaving such routes should be reproducible.

Regards

--
Julian Anastasov <ja@ssi.bg>
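The check that exposed the problem in this thread (a "local" route in table local for an address that is no longer configured on the interface) can be sketched as a small script. This is only an illustration, not a tool from the thread: the function name `stale_local_routes` and its argument order are hypothetical, and it assumes the two inputs are saved outputs of `ip addr show dev DEV` and `ip route list table local` in the format shown above.

```shell
#!/bin/sh
# Sketch only: list "local" host routes on DEV that have no matching configured address.
# stale_local_routes DEV ADDR_FILE ROUTE_FILE  (hypothetical helper)
#   ADDR_FILE:  saved output of "ip addr show dev DEV"
#   ROUTE_FILE: saved output of "ip route list table local"
stale_local_routes() {
    awk -v dev="$1" '
        # Address file: remember every IPv4 address that follows an "inet" token,
        # with the prefix length stripped.
        FILENAME == ARGV[1] {
            for (i = 1; i < NF; i++)
                if ($i == "inet") { split($(i + 1), a, "/"); have[a[1]] = 1 }
            next
        }
        # Route file: report host routes ("local ADDR dev DEV", no prefix length)
        # whose address was not seen in the address file.
        $1 == "local" && $3 == "dev" && $4 == dev && $2 !~ /\// && !($2 in have) {
            print $2
        }
    ' "$2" "$3"
}

# Typical use on a live system:
#   ip addr show dev eth0 > addrs.txt
#   ip route list table local > routes.txt
#   stale_local_routes eth0 addrs.txt routes.txt
```

Fed the `ip addr` and `ip route list table local` output quoted in this thread, the sketch reports 137.226.164.13, the entry Victor suspected should not be there.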