Bug 201791

Summary: reference to netfilter chain not removed on rule replacement, subsequently system hangs
Product: Networking
Reporter: Christoph Anton Mitterer (calestyo)
Component: Netfilter/Iptables
Assignee: networking_netfilter-iptables (networking_netfilter-iptables)
Status: NEW ---    
Severity: high    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.18.20
Subsystem:
Regression: No
Bisected commit-id:

Description Christoph Anton Mitterer 2018-11-27 01:19:18 UTC
The following is from my original report at: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914505

btw: I'd guess this is a regression, but I'm not totally sure... at least I've never noticed this behaviour before.



Hi.

Possibly the following is also partially the fault of iptables (i.e. the userland tool).

I'm using fail2ban in a custom way: the hook rule for fail2ban's chain isn't
just appended somewhere, but inserted at just the right point in my iptables
rules (loaded at boot by netfilter-persistent).

This looks e.g. like the following in terms of rules:
...
-A INPUT	--in-interface lo  -m comment  --comment "f2b-hook-sshd"
-A INPUT	--destination 0.ssh.srv.localhost  --protocol tcp  -m tcp  --destination-port ssh --syn	-j ACCEPT
...
(where the first rule serves as a dummy rule)

And an /etc/fail2ban/action.d/iptables-multiport.conf which looks like:
...
actionstart = <iptables> -N f2b-<name>
              <iptables> -A f2b-<name> -j <returntype>
              rulenum="$( <iptables> -L <chain> --line-numbers  |  grep '/\* f2b-hook-<name> \*/'  |  cut -d ' ' -f 1 )"
              <iptables> -R <chain> "${rulenum}" -p <protocol> -m multiport --dports <port> -j f2b-<name>
...
actionstop = rulenum="$( <iptables> -L <chain> --line-numbers  |  grep f2b-<name>  |  cut -d ' ' -f 1 )"
             <iptables> -R <chain> "${rulenum}" --in-interface lo -m comment --comment f2b-hook-<name>
             <iptables> -F f2b-<name>
             <iptables> -X f2b-<name>
...

So far so good.
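
The rule-number lookup in actionstart above relies on grep and cut. A minimal sketch of the same lookup as a reusable helper, assuming the listing format of `iptables -L <chain> --line-numbers` (rule number in the first column, comment shown as `/* ... */`). The name find_hook_rulenum is hypothetical and not part of fail2ban:

```shell
#!/bin/sh
# Hypothetical helper (not part of fail2ban): print the number of the first
# rule whose comment is exactly "$1". Expects the output of
# `iptables -L <chain> --line-numbers` on stdin.
find_hook_rulenum() {
    # index() does a fixed-string search, so the comment is not treated as a
    # regular expression; the trailing " */" keeps "f2b-hook-ssh" from
    # matching "f2b-hook-sshd".
    awk -v pat="/* $1 */" 'index($0, pat) { print $1; exit }'
}
```

It could be used as `rulenum="$( iptables -L INPUT --line-numbers | find_hook_rulenum f2b-hook-sshd )"`, with a check that the result is non-empty before passing it to `iptables -R`.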


When fail2ban starts I get something like this:
# iptables -L
Chain INPUT (policy DROP)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             state RELATED,ESTABLISHED
ACCEPT     icmp --  anywhere             anywhere            
DROP       all  --  anywhere             anywhere             state INVALID,UNTRACKED
f2b-sshd   tcp  --  anywhere             anywhere             multiport dports ssh
REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable
...
Chain f2b-sshd (1 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere            


Everything fine.


But now:


1) Replacing the rule that causes the reference to f2b-sshd doesn't clear the reference.
Now when I stop fail2ban it will do something like:
iptables -R INPUT 5 --in-interface lo -m comment --comment f2b-hook-<name>
i.e. bringing back the original dummy rule, but here some error happens in either
iptables or the kernel or both:
# iptables -L
Chain INPUT (policy DROP)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             state RELATED,ESTABLISHED
ACCEPT     icmp --  anywhere             anywhere            
DROP       all  --  anywhere             anywhere             state INVALID,UNTRACKED
           all  --  anywhere             anywhere             /* f2b-hook-ssh */
REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable
...
Chain f2b-sshd (1 references)
target     prot opt source               destination         

The dummy rule in INPUT is brought back and the chain f2b-sshd is flushed, but
the chain is left behind with its reference count still at 1, which is obviously
wrong, as the rule no longer references the chain.

This also happens when just calling the iptables commands manually.
It does not happen when e.g. deleting the rules (iptables -D) as fail2ban would
do by default.
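
The manual sequence can be sketched roughly as follows. This is derived from the actionstart/actionstop commands above and is an illustration, not a verified reproducer. The names f2b-test and f2b-hook-test, the port 22 and the rule position 1 are assumptions. By default the script only prints the commands:

```shell
#!/bin/sh
# Dry run by default: every command is only printed. To execute for real,
# set IPT=iptables and run as root on a machine that is allowed to hang.
IPT="${IPT:-echo iptables}"

# f2b-test and f2b-hook-test are hypothetical names for this sketch.
repro_replace_refcount() {
    $IPT -N f2b-test
    $IPT -A f2b-test -j RETURN
    # Dummy hook rule, inserted at position 1 so its rule number is known.
    $IPT -I INPUT 1 --in-interface lo -m comment --comment f2b-hook-test
    # Replace the dummy with a jump; f2b-test should now show 1 reference.
    $IPT -R INPUT 1 -p tcp -m multiport --dports 22 -j f2b-test
    # Replace the jump with the dummy again; the reference should be dropped.
    $IPT -R INPUT 1 --in-interface lo -m comment --comment f2b-hook-test
    $IPT -F f2b-test
    # In this report, the chain header still showed "(1 references)" here.
    $IPT -L f2b-test
}

repro_replace_refcount
```

Using `-I INPUT 1` instead of a lookup by comment keeps the sketch short; the rules from my real setup are not needed for it.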


If I repeat this multiple times, I can even make the reference count go up, e.g.:
Chain f2b-sshd (2 references)
target     prot opt source               destination         
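
To watch the count between the individual steps, the number can be pulled out of the chain header. A small sketch, assuming the header format shown above ("Chain NAME (N references)"); chain_refs is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the reference count of user-defined chain "$1",
# reading `iptables -L` output from stdin. Built-in chains such as INPUT
# show "(policy ...)" instead, so nothing is printed for them.
chain_refs() {
    sed -n "s/^Chain $1 (\([0-9][0-9]*\) references) *\$/\1/p"
}
```

For example, `iptables -L | chain_refs f2b-sshd` after each `iptables -R` call shows whether the count was decremented.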



2) The kernel is now in a state from which it cannot recover, it seems.

It doesn't seem to be possible to get rid of the broken chain, not even
after I deleted the rule that was replaced (or rather, its replacement).

When I try to start from scratch with e.g.
# iptables-restore < /etc/iptables/rules.v4
the process hangs and I get:
Nov 24 03:00:01 heisenberg kernel: [ 9857.115308] ------------[ cut here ]------------
Nov 24 03:00:01 heisenberg kernel: [ 9857.115320] kernel BUG at /build/linux-iActNR/linux-4.18.10/net/netfilter/nf_tables_api.c:1364!
Nov 24 03:00:01 heisenberg kernel: [ 9857.115367] invalid opcode: 0000 [#1] SMP PTI
Nov 24 03:00:01 heisenberg kernel: [ 9857.115379] CPU: 3 PID: 17642 Comm: iptables-restor Not tainted 4.18.0-2-amd64 #1 Debian 4.18.10-2
Nov 24 03:00:01 heisenberg kernel: [ 9857.115382] Hardware name: FUJITSU LIFEBOOK U757/FJNB2A5, BIOS Version 1.21 03/19/2018
Nov 24 03:00:01 heisenberg kernel: [ 9857.115412] RIP: 0010:nf_tables_chain_destroy.isra.48+0x95/0xa0 [nf_tables]
Nov 24 03:00:01 heisenberg kernel: [ 9857.115414] Code: 51 bf ab d8 48 8b 7b 58 e8 78 5b b3 d8 48 89 ef 5b 5d e9 6e 5b b3 d8 48 8b 7b 58 e8 65 5b b3 d8 48 89 df 5b 5d e9 5b 5b b3 d8 <0f> 0b 0f 0b eb 9c 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 07 8b 90 
Nov 24 03:00:01 heisenberg kernel: [ 9857.115450] RSP: 0018:ffffa6c70aaf3998 EFLAGS: 00010202
Nov 24 03:00:01 heisenberg kernel: [ 9857.115454] RAX: 0000000000000001 RBX: ffffffff9a2dafc0 RCX: dead000000000200
Nov 24 03:00:01 heisenberg kernel: [ 9857.115456] RDX: ffff99a0758d3cc0 RSI: ffff99a18a8fa980 RDI: ffff99a22927ef00
Nov 24 03:00:01 heisenberg kernel: [ 9857.115457] RBP: ffff99a0758d3cc0 R08: 0000000000000000 R09: ffffffffc08ff600
Nov 24 03:00:01 heisenberg kernel: [ 9857.115459] R10: ffff99a18a8fae00 R11: 0000000000000001 R12: dead000000000200
Nov 24 03:00:01 heisenberg kernel: [ 9857.115461] R13: dead000000000100 R14: ffff99a18a8fa980 R15: ffffffff9a2dc220
Nov 24 03:00:01 heisenberg kernel: [ 9857.115464] FS:  00007fb5a8b28b80(0000) GS:ffff99a25dd80000(0000) knlGS:0000000000000000
Nov 24 03:00:01 heisenberg kernel: [ 9857.115466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 24 03:00:01 heisenberg kernel: [ 9857.115467] CR2: 00005600467ffda4 CR3: 000000058cfa6005 CR4: 00000000003606e0
Nov 24 03:00:01 heisenberg kernel: [ 9857.115470] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 24 03:00:01 heisenberg kernel: [ 9857.115472] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 24 03:00:01 heisenberg kernel: [ 9857.115473] Call Trace:
Nov 24 03:00:01 heisenberg kernel: [ 9857.115495]  nf_tables_commit+0xd13/0x1110 [nf_tables]
Nov 24 03:00:01 heisenberg kernel: [ 9857.115516]  nfnetlink_rcv_batch+0x562/0x6d0 [nfnetlink]
Nov 24 03:00:01 heisenberg kernel: [ 9857.115538]  ? kmem_cache_alloc_node_trace+0x1b0/0x1e0
Nov 24 03:00:01 heisenberg kernel: [ 9857.115549]  ? alloc_vmap_area+0x7c/0x360
Nov 24 03:00:01 heisenberg kernel: [ 9857.115553]  ? __insert_vmap_area+0x99/0x100
Nov 24 03:00:01 heisenberg kernel: [ 9857.115562]  ? refcount_inc+0x5/0x30
Nov 24 03:00:01 heisenberg kernel: [ 9857.115571]  ? apparmor_capable+0x72/0xb0
Nov 24 03:00:01 heisenberg kernel: [ 9857.115580]  ? security_capable+0x35/0x50
Nov 24 03:00:01 heisenberg kernel: [ 9857.115587]  ? nla_parse+0x32/0x100
Nov 24 03:00:01 heisenberg kernel: [ 9857.115592]  nfnetlink_rcv+0x11e/0x13c [nfnetlink]
Nov 24 03:00:01 heisenberg kernel: [ 9857.115604]  netlink_unicast+0x1c2/0x250
Nov 24 03:00:01 heisenberg kernel: [ 9857.115609]  netlink_sendmsg+0x2c1/0x3b0
Nov 24 03:00:01 heisenberg kernel: [ 9857.115620]  sock_sendmsg+0x36/0x40
Nov 24 03:00:01 heisenberg kernel: [ 9857.115626]  ___sys_sendmsg+0x2a0/0x2f0
Nov 24 03:00:01 heisenberg kernel: [ 9857.115639]  ? filemap_map_pages+0x385/0x3a0
Nov 24 03:00:01 heisenberg kernel: [ 9857.115642]  ? refcount_inc+0x5/0x30
Nov 24 03:00:01 heisenberg kernel: [ 9857.115650]  ? apparmor_capable+0x72/0xb0
Nov 24 03:00:01 heisenberg kernel: [ 9857.115655]  ? security_capable+0x35/0x50
Nov 24 03:00:01 heisenberg kernel: [ 9857.115660]  ? __sys_sendmsg+0x5e/0xa0
Nov 24 03:00:01 heisenberg kernel: [ 9857.115665]  __sys_sendmsg+0x5e/0xa0
Nov 24 03:00:01 heisenberg kernel: [ 9857.115677]  do_syscall_64+0x55/0x110
Nov 24 03:00:01 heisenberg kernel: [ 9857.115688]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 24 03:00:01 heisenberg kernel: [ 9857.115696] RIP: 0033:0x7fb5a8e36354
Nov 24 03:00:01 heisenberg kernel: [ 9857.115697] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 48 8d 05 91 36 0c 00 8b 00 85 c0 75 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 41 54 55 41 89 d4 53 48 89 f5 
Nov 24 03:00:01 heisenberg kernel: [ 9857.115733] RSP: 002b:00007fff63f0ea38 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
Nov 24 03:00:01 heisenberg kernel: [ 9857.115736] RAX: ffffffffffffffda RBX: 00007fff63f0ea50 RCX: 00007fb5a8e36354
Nov 24 03:00:01 heisenberg kernel: [ 9857.115739] RDX: 0000000000000000 RSI: 00007fff63f0fad0 RDI: 0000000000000003
Nov 24 03:00:01 heisenberg kernel: [ 9857.115740] RBP: 00007fff63f10150 R08: 0000000000000004 R09: 00007fb5a8af6f40
Nov 24 03:00:01 heisenberg kernel: [ 9857.115742] R10: 00007fff63f0fabc R11: 0000000000000246 R12: 00005628c5276740
Nov 24 03:00:01 heisenberg kernel: [ 9857.115744] R13: 00007fff63f12a20 R14: 00007fff63f0ea40 R15: 00007fff63f12a58
Nov 24 03:00:01 heisenberg kernel: [ 9857.115747] Modules linked in: udp_diag tcp_diag inet_diag nft_chain_route_ipv4 xt_CHECKSUM nft_chain_nat_ipv4 ipt_MASQUERADE nf_nat_ipv4 nf_nat tun bridge stp llc ctr ccm fuse devlink ebtable_filter ebtables cpufreq_userspace cpufreq_powersave cpufreq_conservative arc4 snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support intel_rapl snd_hda_codec_realtek nf_conntrack_ipv6 nf_defrag_ipv6 x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_generic xt_tcpudp kvm_intel iwlmvm snd_soc_skl snd_soc_skl_ipc ip6t_REJECT snd_soc_sst_ipc nf_reject_ipv6 snd_soc_sst_dsp kvm snd_hda_ext_core irqbypass snd_soc_acpi mac80211 crct10dif_pclmul snd_soc_core crc32_pclmul snd_compress btusb btrtl snd_hda_intel btbcm btintel snd_hda_codec ghash_clmulni_intel bluetooth snd_hda_core intel_cstate iwlwifi snd_hwdep uvcvideo
Nov 24 03:00:01 heisenberg kernel: [ 9857.115815]  intel_uncore snd_pcm videobuf2_vmalloc videobuf2_memops cdc_mbim intel_rapl_perf videobuf2_v4l2 snd_timer cdc_wdm videobuf2_common nf_conntrack_ipv4 cdc_ncm nf_defrag_ipv4 usbnet videodev mii snd pcspkr sdhci_pci cqhci joydev media drbg i915 soundcore sdhci ansi_cprng idma64 nft_counter mmc_core cfg80211 sg ecdh_generic drm_kms_helper crc16 rfkill mei_me intel_lpss_pci drm i2c_i801 intel_lpss xt_comment mei i2c_algo_bit ipt_REJECT nf_reject_ipv4 wmi button battery xt_multiport xt_policy xt_state xt_conntrack nf_conntrack nft_compat tpm_crb fujitsu_laptop tpm_tis tpm_tis_core sparse_keymap video tpm pcc_cpufreq acpi_pad ac rng_core nf_tables nfnetlink binfmt_misc loop parport_pc sunrpc ppdev lp parport ip_tables x_tables autofs4 dm_crypt dm_mod raid10 raid456 async_raid6_recov async_memcpy
Nov 24 03:00:01 heisenberg kernel: [ 9857.115895]  async_pq async_xor async_tx raid1 raid0 multipath linear md_mod btrfs libcrc32c crc32c_generic xor zstd_decompress zstd_compress xxhash raid6_pq uhci_hcd ehci_pci ehci_hcd usb_storage sd_mod crc32c_intel ahci libahci xhci_pci xhci_hcd aesni_intel aes_x86_64 crypto_simd libata cryptd glue_helper evdev scsi_mod psmouse serio_raw e1000e usbcore usb_common
Nov 24 03:00:01 heisenberg kernel: [ 9857.115967] ---[ end trace 78344f348b2da5ca ]---
Nov 24 03:00:01 heisenberg kernel: [ 9857.115982] RIP: 0010:nf_tables_chain_destroy.isra.48+0x95/0xa0 [nf_tables]
Nov 24 03:00:01 heisenberg kernel: [ 9857.115984] Code: 51 bf ab d8 48 8b 7b 58 e8 78 5b b3 d8 48 89 ef 5b 5d e9 6e 5b b3 d8 48 8b 7b 58 e8 65 5b b3 d8 48 89 df 5b 5d e9 5b 5b b3 d8 <0f> 0b 0f 0b eb 9c 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 07 8b 90 
Nov 24 03:00:01 heisenberg kernel: [ 9857.116015] RSP: 0018:ffffa6c70aaf3998 EFLAGS: 00010202
Nov 24 03:00:01 heisenberg kernel: [ 9857.116017] RAX: 0000000000000001 RBX: ffffffff9a2dafc0 RCX: dead000000000200
Nov 24 03:00:01 heisenberg kernel: [ 9857.116018] RDX: ffff99a0758d3cc0 RSI: ffff99a18a8fa980 RDI: ffff99a22927ef00
Nov 24 03:00:01 heisenberg kernel: [ 9857.116020] RBP: ffff99a0758d3cc0 R08: 0000000000000000 R09: ffffffffc08ff600
Nov 24 03:00:01 heisenberg kernel: [ 9857.116022] R10: ffff99a18a8fae00 R11: 0000000000000001 R12: dead000000000200
Nov 24 03:00:01 heisenberg kernel: [ 9857.116023] R13: dead000000000100 R14: ffff99a18a8fa980 R15: ffffffff9a2dc220
Nov 24 03:00:01 heisenberg kernel: [ 9857.116025] FS:  00007fb5a8b28b80(0000) GS:ffff99a25dd80000(0000) knlGS:0000000000000000
Nov 24 03:00:01 heisenberg kernel: [ 9857.116027] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 24 03:00:01 heisenberg kernel: [ 9857.116028] CR2: 00005600467ffda4 CR3: 000000058cfa6005 CR4: 00000000003606e0
Nov 24 03:00:01 heisenberg kernel: [ 9857.116030] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 24 03:00:01 heisenberg kernel: [ 9857.116032] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


Further, all networking seems dead now (probably because netfilter has said goodbye).
Cleanly rebooting also fails, as systemd tries to shut down all kinds of (now hanging)
networking stuff (including netfilter-persistent) and waits forever during shutdown.


I'd guess this is clearly not just an error in the userland tools... or at least the
kernel shouldn't allow userland to get it into such a bad state.
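
On the userland side it may matter which iptables variant is in use: the call trace above goes through nf_tables_commit, which suggests the nf_tables-based variant rather than the legacy one. A small sketch for classifying the version string, assuming the format printed by `iptables -V` in iptables 1.8 and later (e.g. "iptables v1.8.2 (nf_tables)"); older releases print no variant and end up as "unknown". iptables_backend is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: classify an `iptables -V` version string passed as "$1".
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo nf_tables ;;
        *"(legacy)"*)    echo legacy ;;
        *)               echo unknown ;;
    esac
}
```

Usage would be `iptables_backend "$(iptables -V)"`.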


Cheers,
Chris.