Bug 35862 - arp requests from wrong src IP
Summary: arp requests from wrong src IP
Status: RESOLVED OBSOLETE
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV4
Hardware: All Linux
Importance: P1 normal
Assignee: Stephen Hemminger
Reported: 2011-05-25 23:27 UTC by Victor Mataré
Modified: 2012-06-27 14:04 UTC
Regression: No

Description Victor Mataré 2011-05-25 23:27:46 UTC
I switched a host's ip address from 137.226.164.13 to 137.226.164.2. The .13 IP now belongs to the host that had .2 before (I swapped them). Now both hosts still arp from their old IPs although ifconfig as well as ip clearly tell otherwise. Examining the host which now has 137.226.164.13:

# ip addr show dev eth0
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
    inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
    inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0

but arping defaults to the old src IP (.13). I can manually correct this with the -s parameter, but it looks like linux still believes that 137.226.164.13 is this host's ip address. When I try to manually correct the arp table:
# arp -s 137.226.164.13 00:30:48:70:91:95
SIOCSARP: Invalid argument
# arp -n 137.226.164.13
137.226.164.13 (137.226.164.13) -- no entry

And this is what arping does:
# tcpdump -ieth0 -c1 -s0 -vvv -n arp & (sleep 1; arping 137.226.164.13 &> /dev/null)
[1] 2217
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
01:14:37.785126 arp who-has 137.226.164.13 (ff:ff:ff:ff:ff:ff) tell 137.226.164.13

Also, ifconfig doesn't even show the second IP address:
# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:e0:81:41:1f:e4  
          inet addr:137.226.164.2  Bcast:137.226.164.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:103996345 errors:0 dropped:0 overruns:0 frame:0
          TX packets:122352625 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:52478932087 (48.8 GiB)  TX bytes:110248931949 (102.6 GiB)
          Interrupt:24

What's going on here? If this is by design, it's very unintuitive behaviour.
Comment 1 Victor Mataré 2011-05-25 23:30:28 UTC
Sorry, forgot the kernel version. The host above runs a gentoo 2.6.36-hardened-r9 kernel, while the other one (which is not shown but exhibits the same behaviour) has 2.6.29-gentoo-r5.
Comment 2 Andrew Morton 2011-05-25 23:32:02 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 25 May 2011 23:27:48 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=35862
> 
>            Summary: arp requests from wrong src IP
>            Product: Networking
>            Version: 2.5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: IPV4
>         AssignedTo: shemminger@linux-foundation.org
>         ReportedBy: matare@lih.rwth-aachen.de
>         Regression: No
> 
> [...]
Comment 3 David S. Miller 2011-05-26 01:52:53 UTC
From: Andrew Morton <akpm@linux-foundation.org>
Date: Wed, 25 May 2011 16:31:37 -0700

>> I switched a host's ip address from 137.226.164.13 to 137.226.164.2. The .13 IP
>> now belongs to the host that had .2 before (I swapped them). Now both hosts
>> still arp from their old IPs although ifconfig as well as ip clearly tell
>> otherwise. Examining the host which now has 137.226.164.13:
>> 
>> # ip addr show dev eth0
>> 4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
>>     link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
>>     inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
>>     inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0

If you keep the old IP address around it remains as the "primary"
IP address.

You have to explicitly remove the original IP address from the
interface first, then add the new one, in order for the new
one to become the "primary".

Not a bug, please close this.
Comment 4 Victor Mataré 2011-05-26 02:21:33 UTC
For some reason my mail reply doesn't appear here, so I'll repeat:

On Thursday, 26.05.2011 03:52:22 David Miller wrote:
> From: Andrew Morton <akpm@linux-foundation.org>
> Date: Wed, 25 May 2011 16:31:37 -0700
> 
> >> I switched a host's ip address from 137.226.164.13 to 137.226.164.2. The .13 IP
> >> now belongs to the host that had .2 before (I swapped them). Now both hosts
> >> still arp from their old IPs although ifconfig as well as ip clearly tell
> >> otherwise. Examining the host which now has 137.226.164.13:
> >> 
> >> # ip addr show dev eth0
> >> 4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
> >>     link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
> >>     inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
> >>     inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0
> 
> If you keep the old IP address around it remains as the "primary"
> IP address.
> 
> You have to explicitly remove the original IP address from the
> interface first, then add the new one, in order for the new
> one to become the "primary"
> 
> Not a bug, please close this.
> 

Sorry, there's a typo. It's supposed to read:

[...]
Examining the host which now has 137.226.164.2 (used to have 137.226.164.13):

# ip addr show dev eth0
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
     link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
     inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
     inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0
[...]

Sorry, got confused with all the swapping. I'm not keeping the old address around, it's completely *gone*, from both ifconfig and ip. But still it's being used as arp src address. That's what this bug is about. Sorry for the confusion. Reopening, hoping the issue is clear now.
Comment 5 Victor Mataré 2011-05-26 02:31:16 UTC
Sorry, got confused with all the swapping. I'm *not* keeping the old address around, it's completely *gone*, from both ifconfig and ip. But still it's being used as arp src address. That's what this bug is about. Sorry for the confusion.
Comment 6 Andrew Morton 2011-05-26 02:58:39 UTC
Your reply came through OK.  bugzilla can be a bit slow at times.  Emailed reply-to-all is the right thing to do, thanks.
Comment 7 Anonymous Emailer 2011-05-27 05:59:42 UTC
Reply-To: ja@ssi.bg

	Hello,

On Thu, 26 May 2011, Victor Mataré wrote:

> Examining the host which now has 137.226.164.2 (used to have 137.226.164.13):
> 
> # ip addr show dev eth0
> 4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
>      link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
>      inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
>      inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0
> [...]
> 
> Sorry, got confused with all the swapping. I'm *not* keeping the old address
> around, it's completely *gone*, from both ifconfig and ip. But still it's
> being used as arp src address. That's what this bug is about. Sorry for the
> confusion.

	It looks strange. Can you confirm the following things:

- the kernel version

- the order of the 'ip' commands used to add and change IPs on this box

- output of 'ip route list table local' after IPs are changed and
before starting arping

- output of 'strace arping', I assume it is using getsockname
after UDP connect

- any reason to use broadcast 137.226.164.255 for all addresses?

Regards

--
Julian Anastasov <ja@ssi.bg>
Comment 8 Victor Mataré 2011-05-28 00:29:55 UTC
On Friday, 27.05.2011 07:27:23 Julian Anastasov wrote:
> 
>       Hello,
> 
> On Thu, 26 May 2011, Victor Mataré wrote:
> 
> > Examining the host which now has 137.226.164.2 (used to have 137.226.164.13):
> > 
> > # ip addr show dev eth0
> > 4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
> >      link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
> >      inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
> >      inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0
> > [...]
> > 
> > Sorry, got confused with all the swapping. I'm *not* keeping the old address
> > around, it's completely *gone*, from both ifconfig and ip. But still it's
> > being used as arp src address. That's what this bug is about. Sorry for the
> > confusion.
> 
>       It looks strange. Can you confirm the following things:
> 
> - the kernel version

This host runs 2.6.36-hardened-r9. I'm not sure which vanilla release that's based on, but it's patched with grsecurity and PaX. However, another host which exhibits the exact same behaviour runs 2.6.29-gentoo-r5. That one has no hardened or grsecurity patches, only the Gentoo patches, so I'd assume this is neither a version-specific nor a patch-specific problem.

> 
> - the order of 'ip' command used to add and change IPs on this box

OK, the starting situation was two IPs: 137.226.164.13/24 (eth0) and 192.168.23.13/24 (eth0:0).
Then I ran "ifconfig eth0 137.226.164.2 netmask 255.255.255.0".
I'm not exactly sure what happened then, but the result was that "ip addr show dev eth0" still showed the old IP address on eth0, while ifconfig didn't. ifconfig was misbehaving in some way, which is why I checked the situation with the ip tool. Then I used ip to configure everything as intended, and now I have the situation described in this bug. Note that the server has been in production use for a week now despite that.

> 
> - output of 'ip route list table local' after IPs are changed and
> before starting arping

broadcast 127.255.255.255 dev lo  proto kernel  scope link  src 127.0.0.1 
broadcast 192.168.23.0 dev eth0  proto kernel  scope link  src 192.168.23.2 
local 192.168.23.2 dev eth0  proto kernel  scope host  src 192.168.23.2 
local 137.226.164.2 dev eth0  proto kernel  scope host  src 137.226.164.2 
local 137.226.164.13 dev eth0  proto kernel  scope host  src 137.226.164.13 
broadcast 192.168.23.255 dev eth0  proto kernel  scope link  src 192.168.23.2 
broadcast 137.226.164.255 dev eth0  proto kernel  scope link  src 137.226.164.2 
broadcast 137.226.164.255 dev eth0  proto kernel  scope link  src 192.168.23.2 
broadcast 127.0.0.0 dev lo  proto kernel  scope link  src 127.0.0.1 
local 127.0.0.1 dev lo  proto kernel  scope host  src 127.0.0.1 
local 127.0.0.0/8 dev lo  proto kernel  scope host  src 127.0.0.1 

I guess that entry "local 137.226.164.13" shouldn't be there? But shouldn't that be removed automatically when I delete the IP from eth0?
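A rough way to spot such a leftover entry is to compare the host-scoped "local" routes on a device with the addresses actually configured on it. The following is only a sketch: the helper names are made up for illustration, the sample text is abbreviated from the output above, and the parsing assumes the iproute2 output format shown in this report.

```python
import re

# Abbreviated sample output from this report (eth0 entries only, plus loopback).
ROUTES = """\
local 192.168.23.2 dev eth0  proto kernel  scope host  src 192.168.23.2
local 137.226.164.2 dev eth0  proto kernel  scope host  src 137.226.164.2
local 137.226.164.13 dev eth0  proto kernel  scope host  src 137.226.164.13
local 127.0.0.1 dev lo  proto kernel  scope host  src 127.0.0.1
local 127.0.0.0/8 dev lo  proto kernel  scope host  src 127.0.0.1
"""

ADDRS = """\
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:e0:81:41:1f:e4 brd ff:ff:ff:ff:ff:ff
    inet 137.226.164.2/24 brd 137.226.164.255 scope global eth0
    inet 192.168.23.2/24 brd 137.226.164.255 scope global eth0:0
"""


def local_route_ips(route_text, dev):
    """Single-host 'local' routes on dev (prefix routes like 127.0.0.0/8 are skipped)."""
    pattern = r"^local (\d+\.\d+\.\d+\.\d+) dev (\S+)"
    return {ip for ip, d in re.findall(pattern, route_text, re.M) if d == dev}


def configured_ips(addr_text):
    """IPv4 addresses listed on 'inet' lines of 'ip addr show dev <dev>' output."""
    return set(re.findall(r"^[ \t]*inet (\d+\.\d+\.\d+\.\d+)/", addr_text, re.M))


def stale_local_routes(route_text, addr_text, dev):
    """Local routes on dev that have no matching configured address."""
    return sorted(local_route_ips(route_text, dev) - configured_ips(addr_text))


print(stale_local_routes(ROUTES, ADDRS, "eth0"))  # ['137.226.164.13']
```

On a live system the two strings would come from "ip route list table local" and "ip addr show dev eth0" respectively.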

> 
> - output of 'strace arping', I assume it is using getsockname
> after UDP connect

# strace arping 137.226.164.13
[...]
socket(PF_PACKET, SOCK_DGRAM, 0)        = 3
setuid(0)                               = 0
ioctl(3, SIOCGIFINDEX, {ifr_name="eth0", ifr_index=4}) = 0
ioctl(3, SIOCGIFFLAGS, {ifr_name="eth0", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "eth0\0", 5) = 0
setsockopt(4, SOL_SOCKET, SO_DONTROUTE, [1], 4) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("137.226.164.13")}, 16) = 0
getsockname(4, {sa_family=AF_INET, sin_port=htons(44125), sin_addr=inet_addr("137.226.164.13")}, [16]) = 0
close(4)                                = 0
bind(3, {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST, addr(0)={0, }, 128) = 0
getsockname(3, {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST, addr(6)={1, 00e081411fe4}, [18]) = 0
[...] no reply [...]

compare that with:

# strace arping 137.226.164.3
[...]
socket(PF_PACKET, SOCK_DGRAM, 0)        = 3
setuid(0)                               = 0
ioctl(3, SIOCGIFINDEX, {ifr_name="eth0", ifr_index=4}) = 0
ioctl(3, SIOCGIFFLAGS, {ifr_name="eth0", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "eth0\0", 5) = 0
setsockopt(4, SOL_SOCKET, SO_DONTROUTE, [1], 4) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("137.226.164.3")}, 16) = 0
getsockname(4, {sa_family=AF_INET, sin_port=htons(45467), sin_addr=inet_addr("137.226.164.2")}, [16]) = 0
close(4)                                = 0
bind(3, {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST, addr(0)={0, }, 128) = 0
getsockname(3, {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST, addr(6)={1, 00e081411fe4}, [18]) = 0
[...] reply [...]

So that's the change in source address, and I guess it's due to the table above? Then this is more like a bug in the "ip" utility?
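For reference, the source-address probe visible in the strace (a UDP socket, connect(), then getsockname()) can be sketched in a few lines of Python. This is a simplified illustration, not arping's actual code: the function name is made up, and the SO_BINDTODEVICE and SO_DONTROUTE options from the trace are left out because they typically need root or a specific device.

```python
import socket


def probe_source_ip(target, port=1025):
    """Return the local IPv4 address the kernel would use to reach target.

    connect() on a UDP socket sends no packet; it only makes the kernel do
    a route lookup and fix the socket's source address, which
    getsockname() then reports.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((target, port))
        return sock.getsockname()[0]
```

Calling probe_source_ip("137.226.164.13") on the affected host would be expected to return 137.226.164.13, matching the trace: as long as the local table still lists that address, the kernel treats the destination as one of its own addresses and reports it as the source.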

> 
> - any reason to use broadcast 137.226.164.255 for all addresses?

Nope, none at all. I didn't see that because I thought ifconfig and ip use sensible defaults. Well...

So thanks, looks like you're pointing in the right direction.

Victor
Comment 9 Anonymous Emailer 2011-05-28 19:09:30 UTC
Reply-To: ja@ssi.bg

	Hello,

On Sat, 28 May 2011, Victor Mataré wrote:

> ok - starting situation was 2 IPs: 137.226.164.13/24 (eth0) and
> 192.168.23.13/24 (eth0:0)
> then I did "ifconfig eth0 137.226.164.2 netmask 255.255.255.0"
> I'm not exactly sure what happened then, but the result was that "ip addr
> show dev eth0" showed that eth0 still had the old IP address, while ifconfig
> didn't. Ifconfig was misbehaving in some kind of way, that's why I checked
> the situation with the ip tool. Then I used ip to configure everything as
> intended and now I have the situation described in this bug. Note that the
> server has been in productive use for a week now despite of that.
> 
> > 
> > - output of 'ip route list table local' after IPs are changed and
> > before starting arping
> 
> broadcast 127.255.255.255 dev lo  proto kernel  scope link  src 127.0.0.1 
> broadcast 192.168.23.0 dev eth0  proto kernel  scope link  src 192.168.23.2 
> local 192.168.23.2 dev eth0  proto kernel  scope host  src 192.168.23.2 
> local 137.226.164.2 dev eth0  proto kernel  scope host  src 137.226.164.2 
> local 137.226.164.13 dev eth0  proto kernel  scope host  src 137.226.164.13 
> broadcast 192.168.23.255 dev eth0  proto kernel  scope link  src 192.168.23.2 
> broadcast 137.226.164.255 dev eth0  proto kernel  scope link  src 137.226.164.2 
> broadcast 137.226.164.255 dev eth0  proto kernel  scope link  src 192.168.23.2 
> broadcast 127.0.0.0 dev lo  proto kernel  scope link  src 127.0.0.1 
> local 127.0.0.1 dev lo  proto kernel  scope host  src 127.0.0.1 
> local 127.0.0.0/8 dev lo  proto kernel  scope host  src 127.0.0.1 
> 
> I guess that entry "local 137.226.164.13" shouldn't be there? But shouldn't
> that be removed automatically when I delete the IP from eth0?

	Yes, this problem looks like what we fixed recently:

http://marc.info/?l=linux-netdev&m=129848300922970&w=2
http://marc.info/?l=linux-netdev&m=130048961407666&w=2
http://marc.info/?l=linux-netdev&m=130057251901164&w=2

	It can happen only when 137.226.164.13 is added more
than once with different subnet masks at the same time,
e.g. /32 and /24.

	To understand what really happens in your setup
we should try commands that reproduce the problem, e.g.
on some unused device such as eth1 or dummy0. The first
link has such a test script as an example. Leaving such
routes behind should be reproducible.

Regards

--
Julian Anastasov <ja@ssi.bg>
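
A reproduction attempt along the lines suggested above could look like the sketch below. It is not the test script from the linked post, and whether this exact sequence leaves a stale route behind on an affected kernel is an assumption. The function names, the dummy0 device and the 192.0.2.13 documentation address are placeholders; the ip subcommands themselves are standard iproute2, and running them requires root.

```python
import subprocess


def repro_commands(dev="dummy0", ip="192.0.2.13"):
    """Build the command sequence; nothing is executed here."""
    return [
        ["ip", "link", "add", dev, "type", "dummy"],
        ["ip", "link", "set", dev, "up"],
        # Same address twice, with different prefix lengths.
        ["ip", "addr", "add", f"{ip}/32", "dev", dev],
        ["ip", "addr", "add", f"{ip}/24", "dev", dev],
        # Remove both again.
        ["ip", "addr", "del", f"{ip}/32", "dev", dev],
        ["ip", "addr", "del", f"{ip}/24", "dev", dev],
        # Any "local <ip>" line still printed here is a leftover route.
        ["ip", "route", "list", "table", "local", "dev", dev],
        ["ip", "link", "del", dev],
    ]


def run(commands):
    """Echo and execute each command, continuing past failures."""
    for cmd in commands:
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=False)
```

Running run(repro_commands()) as root on a scratch machine prints each command before executing it; the output of the "ip route list" step is the part to inspect.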