Bug 202249
Summary: | WARNING: CPU: 1 PID: 3904 at net/ipv4/raw_diag.c:28 raw_get_hashinfo.isra.0+0x1a/0x30 | ||
---|---|---|---|
Product: | Networking | Reporter: | Martin Zaharinov (micron10) |
Component: | IPV4 | Assignee: | Stephen Hemminger (stephen) |
Status: | NEW --- | ||
Severity: | high | CC: | howaboutsynergy, nanericwang |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 4.20.2 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Martin Zaharinov
2019-01-13 09:52:12 UTC
I just hit this with kernel-stable from repo. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git commit b724e93564045d97ec1179d3acacdac60e658569 (HEAD -> makepkg, tag: v5.1.1, origin/linux-5.1.y) Date: Sat May 11 07:49:56 2019 +0200 Linux 5.1.1 ``` [ 0.000000] microcode: microcode updated early to revision 0xaa, date = 2018-12-12 [ 0.000000] Linux version 5.1.1-gb724e9356404 (user@i87k) (gcc version 8.3.0 (GCC)) #2 SMP Sun May 12 04:05:34 CEST 2019 [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-linux-stable root=UUID=2b8b9ab8-7ac5-4586-aa42-d7ffb12de92a rw root_trim=yes rd.luks.allow-discards rd.luks.options=discard ipv6.disable=1 ipv6.disable_ipv6=1 ipv6.autoconf=0 loglevel=15 log_buf_len=16M ignore_loglevel printk.always_kmsg_dump=y printk.time=y printk.devkmsg=on mminit_loglevel=4 memory_corruption_check=1 fbcon=scrollback:4096k fbcon=font:ProFont6x11 net.ifnames=0 pax_sanitize_slab=full nolvm dobtrfs console=tty1 earlyprintk=vga audit=0 systemd.log_target=kmsg systemd.journald.forward_to_console=1 enforcing=0 udev.children-max=1256 rd.udev.children-max=1256 nohz=on oops=panic crashkernel=128M panic=0 psi=1 sysrq_always_enabled random.trust_cpu=off logo.nologo lpj=0 mce=bootlog reboot=force,cold noexec=on nohibernate scsi_mod.use_blk_mq=1 consoleblank=120 noefi cpuidle.governor=teo zram.num_devices=3 zswap.enabled=0 zswap.same_filled_pages_enabled=1 zswap.compressor=zstd zswap.max_pool_percent=40 zswap.zpool=z3fold i915.alp [ 0.000000] KERNEL supported cpus: [ 0.000000] Intel GenuineIntel ... [ 28.395363] raw_diag: Unexpected inet family 10 [ 28.395372] WARNING: CPU: 8 PID: 1356 at net/ipv4/raw_diag.c:28 raw_get_hashinfo.isra.3+0x1a/0x30 [raw_diag] [ 28.395372] Modules linked in: raw_diag inet_diag af_packet_diag msr xt_TCPMSS iptable_mangle iptable_security iptable_nat nf_nat iptable_raw nf_log_ipv4 nf_log_common xt _owner xt_LOG xt_connlimit nf_conncount xt_conntrack nf_conntrack nf_defrag_ipv4 xt_hashlimit xt_multiport xt_addrtype snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass i915 crct10dif_pclmul snd_hda_intel i2c_algo_bit crc32_pclmul snd_hda_codec crc32c_intel drm_kms_helper snd_hwdep syscopyarea snd_hda_core sysfillrect sysimgblt snd_pcm fb_sys_fops iTCO_wdt ghash_clmulni_intel snd_timer drm iTCO_vendor_support intel_cstate snd intel_uncore mei_me intel_rapl_perf e1000e mq_deadline soundcore drm_panel_orientation_quirks pcspkr i2c_i801 mei video acpi_pad xhci_pci xhci_hcd [ 28.395384] CPU: 8 PID: 1356 Comm: ss Kdump: loaded Tainted: G U 5.1.1-gb724e9356404 #2 [ 28.395385] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 1002 07/02/2018 [ 28.395386] RIP: 0010:raw_get_hashinfo.isra.3+0x1a/0x30 [raw_diag] [ 28.395387] Code: e8 01 89 46 3c c3 66 2e 0f 1f 84 00 00 00 00 00 48 c7 c0 20 9e 00 93 40 80 ff 02 74 16 80 3d d8 1f 00 00 00 0f 84 4c 04 00 00 <0f> 0b 48 c7 c0 ea ff ff ff c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 [ 28.395388] RSP: 0018:ffffb160099f3a58 EFLAGS: 00010246 [ 28.395389] RAX: 0000000000000023 RBX: ffff91a223e78b00 RCX: 0000000000000000 [ 28.395390] RDX: 0000000000000000 RSI: ffff91a22da15508 RDI: ffff91a22da15508 [ 28.395390] RBP: ffff91a1c8320c10 R08: 00000000000004e7 R09: 0000000000000032 [ 28.395391] R10: 0000000000000001 R11: ffffb160099f3900 R12: ffff91a1c8320c10 [ 28.395392] R13: ffff91a223d22708 R14: ffff91a223d22708 R15: 0000000000000000 [ 28.395392] FS: 00007a609c7d3740(0000) GS:ffff91a22da00000(0000) knlGS:0000000000000000 [ 28.395393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 28.395394] CR2: 00007ffe1cd4c0a0 CR3: 00000007c88d6004 CR4: 00000000003606e0 [ 28.395394] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 28.395395] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 28.395396] Call Trace: [ 28.395398] raw_diag_dump+0x3b/0x1d0 [raw_diag] [ 28.395400] __inet_diag_dump+0x3c/0x80 [inet_diag] [ 28.395402] netlink_dump+0x11e/0x370 [ 28.395403] __netlink_dump_start+0x197/0x210 [ 28.395404] inet_diag_handler_cmd+0xc8/0xe0 [inet_diag] [ 28.395406] ? inet_diag_dump_compat+0xe0/0xe0 [inet_diag] [ 28.395407] sock_diag_rcv_msg+0x10f/0x130 [ 28.395408] ? sock_diag_bind+0x50/0x50 [ 28.395409] netlink_rcv_skb+0x44/0x110 [ 28.395410] sock_diag_rcv+0x1f/0x30 [ 28.395411] netlink_unicast+0x1b2/0x250 [ 28.395412] netlink_sendmsg+0x1ff/0x3d0 [ 28.395414] sock_sendmsg+0x31/0x40 [ 28.395415] ___sys_sendmsg+0x290/0x2f0 [ 28.395417] ? ___sys_recvmsg+0x167/0x200 [ 28.395418] ? __check_object_size+0x3c/0x181 [ 28.395420] ? _copy_to_user+0x26/0x30 [ 28.395421] ? move_addr_to_user+0xb1/0xe0 [ 28.395423] __sys_sendmsg+0x52/0xa0 [ 28.395424] do_syscall_64+0x50/0x150 [ 28.395426] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 28.395427] RIP: 0033:0x7a609c8db4c8 [ 28.395428] Code: ff eb b4 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 55 fd 0b 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 60 c3 0f 1f 80 00 00 00 00 55 48 89 e5 41 55 [ 28.395429] RSP: 002b:00007ffe1ccafe38 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 28.395430] RAX: ffffffffffffffda RBX: 00007ffe1ccaffb0 RCX: 00007a609c8db4c8 [ 28.395430] RDX: 0000000000000000 RSI: 00007ffe1ccaff40 RDI: 0000000000000003 [ 28.395431] RBP: 000000000000000a R08: 0000000000007fff R09: 0000000000000000 [ 28.395431] R10: 00007a609c996250 R11: 0000000000000246 R12: 00007ffe1ccafec0 [ 28.395432] R13: 0000000000000003 R14: 00007ffe1ccb0000 R15: 00007ffe1ccaff40 [ 28.395433] ---[ end trace 33a830973ab9aad3 ]--- ``` I'm on ArchLinux So this is because `CONFIG_IPV6` is disabled in kernel `.config` ``` include/linux/socket.h:172:#define AF_INET6 10 /* IP version 6 */ ``` and, as I understand it, some process tried to send IPv6 packet? this shows it happens every 1 second: ``` diff --git a/net/ipv4/raw_diag.c b/net/ipv4/raw_diag.c index c200065ef9a5..d9d9a4e3e07a 100644 --- a/net/ipv4/raw_diag.c +++ b/net/ipv4/raw_diag.c @@ -23,7 +23,8 @@ raw_get_hashinfo(const struct inet_diag_req_v2 *r) return &raw_v6_hashinfo; #endif } else { - pr_warn_once("Unexpected inet family %d\n", + //pr_warn_once("Unexpected inet family %d\n", + pr_warn("Unexpected inet family %d\n", r->sdiag_family); WARN_ON_ONCE(1); return ERR_PTR(-EINVAL); ``` and it's caused by running `ss -anptSudw0eom`, for me. /usr/bin/ss is owned by iproute2 5.0.0-1 So, `ss` tries to send some ipv6 packet(s) every second, or something... and it even shows it in itself like this: ``` Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port p_dgr UNCONN 0 0 *:* * ino:39983 sk:6 ver:1 cpy_thresh:0 flags( running auxdata ) ring_rx(blk_size:524288,blk_nr:8,frm_size:262240,frm_nr:8,tmo:0,features:0x0) skmem:(r0,rb1048576,t0,tb1048576,f0,w0,o96,bl0,d0) ``` ``` -a, --all Display both listening and non-listening (for TCP this means established connections) sockets. -n, --numeric Do not try to resolve service names. -p, --processes Show process using socket. -t, --tcp Display TCP sockets. -S, --sctp Display SCTP sockets. -u, --udp Display UDP sockets. -d, --dccp Display DCCP sockets. -w, --raw Display RAW sockets. -0, --packet Display PACKET sockets (alias for -f link). -e, --extended Show detailed socket information. The output format is: uid:<uid_number> ino:<inode_number> sk:<cookie> <uid_number> the user id the socket belongs to <inode_number> the socket's inode number in VFS <cookie> an uuid of the socket -o, --options Show timer information. For tcp protocol, the output format is: timer:(<timer_name>,<expire_time>,<retrans>) <timer_name> the name of the timer, there are five kind of timer names: on: means one of these timers: tcp retrans timer, tcp early retrans timer and tail loss probe timer keepalive: tcp keep alive timer timewait: timewait stage timer persist: zero window probe timer unknown: none of the above timers <expire_time> how long time the timer will expire <retrans> how many times the retran occurs -m, --memory Show socket memory usage. The output format is: skmem:(r<rmem_alloc>,rb<rcv_buf>,t<wmem_alloc>,tb<snd_buf>, f<fwd_alloc>,w<wmem_queued>, o<opt_mem>,bl<back_log>) <rmem_alloc> the memory allocated for receiving packet <rcv_buf> the total memory can be allocated for receiving packet <wmem_alloc> the memory used for sending packet (which has been sent to layer 3) <snd_buf> the total memory can be allocated for sending packet <fwd_alloc> the memory allocated by the socket as cache, but not used for receiving/sending packet yet. If need memory to send/receive packet, the memory in this cache will be used before allocate additional memory. <wmem_queued> The memory allocated for sending packet (which has not been sent to layer 3) <opt_mem> The memory used for storing socket option, e.g., the key for TCP MD5 signature <back_log> The memory used for the sk backlog queue. On a process context, if the process is receiving packet, and a new packet is received, it will be put into the sk backlog queue, so it can be received by the process immediately ``` To be clear, I meant, running `ss` every second like this: ``` while true; do ss -anptSudw0eom --vsock #^ thanks to dreisner on freenode for suggesting 'ss' instead of the ancient 'netstat' echo '------------' sleep 1 done ``` Actually I should've ran it as `sudo ss` to see that it was `tcpdump` causing the above output inside `ss`: ``` Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port p_dgr UNCONN 0 0 *:* * users:(("tcpdump",pid=4895,fd=3)) ino:65078 sk:a ver:1 cpy_thresh:0 flags( running auxdata ) ring_rx(blk_size:524288,blk_nr:8,frm_size:262240,frm_nr:8,tmo:0,features:0x0) skmem:(r0,rb1048576,t0,tb1048576,f0,w0,o96,bl0,d0) ------------ ``` Not running `tcpdump` at all, still causes the `raw_diag: Unexpected inet family 10` every 1 second because it was indeed causes by running `ss` every second. So, in conclusion, something (in my case `ss`) was trying to send out an ipv6 packet in a kernel compiled with unset `CONFIG_IPV6` and it is reported only once, per boot(session), by default! PS: `ss -n` or `ss -a` is enough to trigger that. Actually it's `ss -w`(raw, displa RAW sockets) that causes it. Looking back at OP, it was `ss` for them too: `[ 173.185512] CPU: 1 PID: 3904 Comm: ss Tainted: G O 4.20.1 #1` so in my case just removing `-w` gets rid of the dmesg message, ie. `sudo ss -anptSudw0eom --vsock` became: `sudo ss -anptSud0eom --vsock` but note that specifying just `ss -a` alone will imply `-w`, but the above("`sudo ss -anptSud0eom --vsock`" that is) somehow doesn't(imply `-w`). ``` ┌── INET: socket monitoring interface ────────────────────────────────────────┐ │ │ │ CONFIG_INET_DIAG: │ │ │ │ Support for INET (TCP, DCCP, etc) socket monitoring interface used by │ │ native Linux tools such as ss. ss is included in iproute2, currently │ │ downloadable at: │ │ │ │ http://www.linuxfoundation.org/collaborate/workgroups/networking/iproute2 │ │ │ │ If unsure, say Y. │ │ │ │ Symbol: INET_DIAG [=m] │ │ Type : tristate │ │ Prompt: INET: socket monitoring interface │ │ Location: │ │ -> Networking support (NET [=y]) │ │ -> Networking options │ │ -> TCP/IP networking (INET [=y]) │ │ Defined at net/ipv4/Kconfig:426 │ │ Depends on: NET [=y] && INET [=y] │ │ │ │ │ │ │ │ <OK> │ └─────────────────────────────────────────────────────────────────────────────┘ ``` that file `net/ipv4/raw_diag.c` is under `CONFIG_INET_RAW_DIAG` so this would be another way to get rid of seeing the message. ``` ┌── RAW: socket monitoring interface ─────────────────────────────────────────────────┐ │ │ │ CONFIG_INET_RAW_DIAG: │ │ │ │ Support for RAW socket monitoring interface used by the ss tool. │ │ If unsure, say Y. │ │ │ │ Symbol: INET_RAW_DIAG [=m] │ │ Type : tristate │ │ Prompt: RAW: socket monitoring interface │ │ Location: │ │ -> Networking support (NET [=y]) │ │ -> Networking options │ │ -> TCP/IP networking (INET [=y]) │ │ -> INET: socket monitoring interface (INET_DIAG [=m]) │ │ Defined at net/ipv4/Kconfig:450 │ │ Depends on: NET [=y] && INET [=y] && INET_DIAG [=m] && (IPV6 [=n] || IPV6 [=n]=n) │ │ │ │ │ │ │ │ <OK> │ └─────────────────────────────────────────────────────────────────────────────────────┘ ``` I suspect this has already been fixed. Please try and reproduce this with a supported kernel 4.19.42, 5.0.15, or 5.1.1 The 4.20 kernel is not supported by upstream developers. If you are using a distribution kernel please report it to them. The issue can still be reproduced with 5.1.15.
> Jun 29 08:12:38 kernel: raw_diag: Unexpected inet family 10
> Jun 29 08:12:38 kernel: WARNING: CPU: 0 PID: 5182 at 0xffffffffa021a03a
> Jun 29 08:12:38 kernel: Modules linked in: raw_diag inet_diag unix_diag evdev
> hv_netvsc hv_utils ptp pps_core cn hyperv_keyboard serio 8250 hv_balloon
> ucs2_str>
> Jun 29 08:12:38 kernel: CPU: 0 PID: 5182 Comm: ss Not tainted 5.1.15-1-ck #2
> Jun 29 08:12:38 kernel: Hardware name: Microsoft Corporation Virtual
> Machine/Virtual Machine, BIOS 090007 06/02/2017
> Jun 29 08:12:38 kernel: RIP: 0010:0xffffffffa021a03a
> Jun 29 08:12:38 kernel: Code: c8 89 46 3c c3 66 66 2e 0f 1f 84 00 00 00 00 00
> 48 c7 c0 20 9d f1 81 40 80 ff 02 74 16 80 3d d4 1f 00 00 00 0f 84 bc 03 00 00
> <0f>
> Jun 29 08:12:38 kernel: RSP: 0018:ffffc9000058ba10 EFLAGS: 00010246
> Jun 29 08:12:38 kernel: RAX: 0000000000000023 RBX: 0000000000000000 RCX:
> 0000000000000000
> Jun 29 08:12:38 kernel: RDX: 0000000000000000 RSI: ffff8881bba14f78 RDI:
> ffff8881bba14f78
> Jun 29 08:12:38 kernel: RBP: ffff8881a64d0900 R08: ffff8881bba14f78 R09:
> 0000000000000004
> Jun 29 08:12:38 kernel: R10: 0000000000000000 R11: 0000000000000001 R12:
> ffff8881a5e69010
> Jun 29 08:12:38 kernel: R13: ffff8881b8d7e2f8 R14: ffff8881a64d0900 R15:
> ffff8881b8d7e2f8
> Jun 29 08:12:38 kernel: FS: 00007f07d1f5a740(0000) GS:ffff8881bba00000(0000)
> knlGS:0000000000000000
> Jun 29 08:12:38 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Jun 29 08:12:38 kernel: CR2: 000055be2322a188 CR3: 00000001a896e004 CR4:
> 00000000001606b0
as per my previous comments, this is just the kernel telling you that some program (`ss`) was the first to try to send an ipv6 packet while you have ipv6 support disabled in your kernel(probably not built in? or does the same happen with ipv6.disable=1 in kernel /proc/cmdline? unsure) not technically an issue, so nothing to fix really. IPv6 stack is not built in my kernel, because i don't use IPv6. Also, everything else works fine. If you want to ignore this message see Comment 3 and just comment out those: ``` pr_warn_once("Unexpected inet family %d\n", r->sdiag_family); ``` and ``` WARN_ON_ONCE(1); ``` in file `net/ipv4/raw_diag.c` You can do this since you're building your own kernel. As you can see it's not a bug, just kernel printing some info once, only the first time it happens (it = when something tries to send an ipv6 packet ) The `WARN_ON_ONCE(1)` is the one that dumps the stacktrace. |