Bug 202147 - 4.20 breaks dhcp over gre6
Summary: 4.20 breaks dhcp over gre6
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV6 (show other bugs)
Hardware: x86-64 Linux
: P1 normal
Assignee: Hideaki YOSHIFUJI
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-01-04 20:02 UTC by Friedrich Oslage
Modified: 2024-12-06 23:37 UTC (History)
1 user (show)

See Also:
Kernel Version: 4.20.0
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
An af packet socket to reproduce the bug (3.05 KB, text/x-csrc)
2024-12-06 23:37 UTC, Emanuele Santini
Details

Description Friedrich Oslage 2019-01-04 20:02:47 UTC
Starting with kernel 4.20.0, dhcp over gre6 no longer works.

The sendto(2) call fails with EINVAL. udhcpc output:
udhcpc: started, v1.29.3
Setting IP address 0.0.0.0 on testgre1
udhcpc: sending discover
udhcpc: sendto: Invalid argument
udhcpc: sending discover
udhcpc: sendto: Invalid argument
udhcpc: sending discover
udhcpc: sendto: Invalid argument
udhcpc: no lease, failing

After encountering this error ip(8) will hang indefinitely when trying to remove the tunnel interface. Dmesg output:
unregister_netdevice: waiting for testgre1 to become free. Usage count = 3

This is not a udhcpc issue as I am able to reproduce it using a custom dhcp client written in python(scapy). Both clients work fine with kernel 4.19.13 and older.

Steps to reproduce:
- ip -6 tunnel add testgre1 mode ip6gre remote fc00::1 local fc00:127::67
- busybox udhcpc -i testgre1 -n -f
- ip -6 tunnel del testgre1
Comment 1 Friedrich Oslage 2019-01-19 10:55:17 UTC
This bug was introduced by commit 99137b7888f4058087895d035d81c6b2d31015c5.
Reverting that commit makes dhcp over gre6 work again.

busybox set's saddr->sll_halen to 6 (length of mac address), but dev->addr_len is 16 (length of ipv6 address).

I don't think this is a bug in busybox, as their length really is 6...
Comment 2 Emanuele Santini 2024-12-06 23:37:02 UTC
Created attachment 307329 [details]
An af packet socket to reproduce the bug

This is a simple test that I make to reproduce the bug.
Comment 3 Emanuele Santini 2024-12-06 23:37:15 UTC
Hi, I would like to work on resolving this bug and have conducted an initial investigation.

I was able to reproduce this bug with a simple C program that send a packet through an ip6gre tunnel using a packet socket (family: AF_PACKET, type: SOCK_DGRAM). I have observed that the sendto system call fails with the EINVAL error. So, I confirm that this bug is not only reproduced with busybox.

In my custom test program (ipgre6_test.c that I have attached), I set up an AF_PACKET socket and performed two sendto tests on the ip6gre tunnel:

First sendto test:
 - sendto(sock, packet, sizeof(struct ipv6hdr) + strlen(payload), 0, (struct sockaddr *)&addr, sizeof(struct sockaddr_ll)); -> This will fail with EINVAL

Second sendto test:
 - sendto(sock, packet, sizeof(struct ipv6hdr) + strlen(payload), 0, (struct sockaddr *)&addr, sizeof(struct sockaddr_ll) + 10); -> This will NOT fail.

Result: This second call succeeds, and the packet is sent over the ip6gre tunnel.

The second test demonstrates that the bug can be bypassed by providing a false socket address size (in this case, adding 10 as a small arbitrary value). This suggests that the issue can be manipulated at the user level.

I proceeded with an in-depth kernel debugging session to better understand the problem. I observed that this issue is not restricted to DHCP clients; an AF_PACKET socket alone is sufficient to trigger it.

As previously noted, the problem was introduced by commit 99137b7888f4058087895d035d81c6b2d31015c5, which added a condition to the sendto system call for AF_PACKET sockets of type SOCK_DGRAM. The condition checks if the socket address provided by the user is sufficiently long to include the hardware address and other fields in sockaddr_ll. This is implemented in net/packet/af_packet.c, packet_snd function, as:

 - if (dev && msg->msg_namelen < dev->addr_len + offsetof(struct sockaddr_ll, sll_addr))

 Here:
  msg->msg_namelen is the size of the socket address provided by the user;
  dev->addr_len is the length of the hardware address;
  sockaddr_ll, represents the socket address structure for a link-layer interface. The maximum size for the hardware address accepted in sockaddr_ll is 8 bytes, and offsetof(struct sockaddr_ll, sll_addr) is 12.

For the first sendto test, the following values were observed:

 - msg->msg_namelen = 20: This is the size of the sockaddr_ll object, provided as the dest_len parameter in the sendto call.
 - dev->addr_len = 16: This value is set by the ip6gre module.

The dev->addr_len field in the net_device structure is being used improperly in the GRE implementation; According to kernel documentation, addr_len should represent the hardware address length, which, for a GRE tunnel, should be 0. However, the GRE module sets addr_len to the size of the network address (16 for ip6gre and 4 for gre).
Furthermore, I noticed an unusual line in ipv6/ip6_gre.c within the ip6gre_tunnel_init function:

 - __dev_addr_set(dev, &tunnel->parms.laddr, sizeof(struct in6_addr));

Here, __dev_addr_set is used to set the hardware address of a net_device. However, in GRE implementations (both IPv4 and IPv6), this appears to represent the network address. Shouldn't this field also be set to 0 for GRE tunnels?
If my interpretation is correct, the GRE implementation (for both IPv6 and IPv4) uses the net_device hardware address fields to store the tunnel network address. This approach is incompatible with how packet sockets interpret these fields.

Given the current GRE implementation, the most feasible solution seems to be modifying the condition in packet_snd. Specifically, I believe that AF_PACKET sockets of type SOCK_DGRAM on GRE tunnels do not need to invoke the dev_hard_header function. In such cases, we could bypass the condition. But I need to find "an elegant way" to do this.

I am also trying to understand whether using an AF_PACKET socket makes sense in the context of GRE tunneling. Any suggestions?

Note You need to log in before you can comment on or make changes to this bug.