Bug 201243

Summary: aarch64: dnf makecache triggers "Unable to handle kernel paging request" oops on OrangePi Prime after about 10 hours of uptime
Product: Platform Specific/Hardware
Reporter: Ziqian SUN (Zamir) (sztsian)
Component: ARM
Assignee: linux-arm-kernel (linux-arm-kernel)
Status: NEW
Severity: normal
CC: pbrobinson
Priority: P1
Hardware: ARM
OS: Linux
See Also: https://bugzilla.redhat.com/show_bug.cgi?id=1628574
Kernel Version: kernel-4.18.5-200.fc28.aarch64
Subsystem:
Regression: No
Bisected commit-id:

Description Ziqian SUN (Zamir) 2018-09-26 13:28:27 UTC
Description of problem:
An "Unable to handle kernel paging request" oops occurs after the OrangePi Prime has been running for about 10 hours. The dnf makecache timer appears to be the trigger: with the timer disabled, the oops does not occur.

Version-Release number of selected component (if applicable):
Fedora 28 aarch64

How reproducible:

Steps to Reproduce:
1. Power on the OrangePi Prime with the given kernel, then leave it idle until dnf makecache has run automatically several times.

Actual results:
The system hangs and "Unable to handle kernel paging request" is printed on the console.
(There are many "wlan0: link is not ready" messages before and after the oops; most of them are not pasted here.)

[36630.569110] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[36945.596633] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[37124.218855] Unable to handle kernel paging request at virtual address ffff000009687dd8
[37124.226850] Mem abort info:
[37124.229645]   ESR = 0x96000007
[37124.232700]   Exception class = DABT (current EL), IL = 32 bits
[37124.238655]   SET = 0, FnV = 0
[37124.241715]   EA = 0, S1PTW = 0
[37124.244855] Data abort info:
[37124.247753]   ISV = 0, ISS = 0x00000007
[37124.251589]   CM = 0, WnR = 0
[37124.254562] swapper pgtable: 4k pages, 48-bit VAs, pgdp = 0000000014395a12
[37124.261447] [ffff000009687dd8] pgd=00000000bfffe803, pud=00000000bfffd803, pmd=00000000bfff9803, pte=00f8000041687f13
[37124.272077] Internal error: Oops: 96000007 [#1] SMP
[37124.276955] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables vfat fat realtek snd_soc_hdmi_codec rc_cec dw_hdmi_cec dw_hdmi_i2s_audio sun8i_codec_analog sun4i_codec dwmac_sun8i r8723bs(C) stmmac_platform snd_soc_core stmmac mdio_mux ac97_bus of_mdio snd_pcm_dmaengine sun8i_drm_hdmi fixed_phy snd_pcm libphy dw_hdmi snd_timer crc32_ce cec crct10dif_ce snd sunxi_cir sun4i_drm ghash_ce rc_core sun4i_frontend sun8i_mixer sun4i_tcon soundcore
[37124.347855]  cfg80211 drm_kms_helper sunxi_wdt drm rfkill fb_sys_fops syscopyarea sysfillrect leds_gpio sysimgblt mmc_block sunxi phy_generic musb_hdrc udc_core ohci_platform ehci_platform sunxi_mmc phy_sun4i_usb gpio_keys
[37124.367625] CPU: 3 PID: 2538 Comm: dnf Tainted: G         C        4.18.5-200.fc28.aarch64 #1
[37124.376148] Hardware name: sunxi sunxi/sunxi, BIOS 2018.03 04/15/2018
[37124.382589] pstate: 20400005 (nzCv daif +PAN -UAO)
[37124.387395] pc : memblock_is_map_memory+0x24/0xa0
[37124.392105] lr : pfn_valid+0x20/0x30
[37124.395679] sp : ffff00000e033c60
[37124.398993] x29: ffff00000e033c60 x28: ffff8000763a8000 
[37124.404308] x27: ffff000008a42000 x26: 000000000000004f 
[37124.409620] x25: 00000000000007ff x24: ffff00000e033e38 
[37124.414924] x23: 0000000000000000 x22: ffff800074cbb000 
[37124.420236] x21: 0000000074cba020 x20: 0000000000000fe0 
[37124.425550] x19: 00000000b4cba000 x18: 000000000000023f 
[37124.430863] x17: 0000000000000000 x16: 0000000000000000 
[37124.436176] x15: 0000000000000000 x14: 000000000000002b 
[37124.441490] x13: 736e6967756c702d x12: 666e642f73656761 
[37124.446803] x11: 6b6361702d657469 x10: 732f362e336e6f68 
[37124.452117] x9 : 736567616b636170 x8 : ffff800074cbe000 
[37124.457430] x7 : ffff800074cba000 x6 : 0000000000004000 
[37124.462743] x5 : 0000000000015831 x4 : 0000ffffffffffff 
[37124.468055] x3 : 0000000000000000 x2 : 0000000000000000 
[37124.473368] x1 : 0000000000000fe0 x0 : ffff000009687dc8 
[37124.478684] Process dnf (pid: 2538, stack limit = 0x00000000956e6cc5)
[37124.485120] Call trace:
[37124.487571]  memblock_is_map_memory+0x24/0xa0
[37124.491928]  pfn_valid+0x20/0x30
[37124.495162]  __check_object_size+0x68/0x1e0
[37124.499351]  strncpy_from_user+0x48/0x368
[37124.503362]  getname_flags+0x6c/0x1b0
[37124.507026]  user_path_at_empty+0x40/0x78
[37124.511039]  vfs_statx+0x80/0xe0
[37124.514269]  sys_newfstatat+0x40/0x68
[37124.517934]  el0_svc_naked+0x30/0x34
[37124.521515] Code: d503201f 90009fa0 91372000 52800003 (b9401002) 
[37124.527610] ---[ end trace 68c59fb7d3d25bfc ]---
[37260.624594] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
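
For anyone cross-checking the trace, here is a minimal Python sketch that decodes the ESR value and the faulting instruction using only numbers copied from the oops above. The function names are hypothetical; the bit layouts are taken from the Armv8-A ESR_ELx definition and the A64 "LDR (immediate, unsigned offset)" encoding. It only confirms that the logged fields agree with each other and does not identify the root cause.

```python
# Values copied verbatim from the oops above.
ESR = 0x96000007                  # "ESR = 0x96000007"
FAULT_ADDR = 0xFFFF000009687DD8   # faulting virtual address
X0 = 0xFFFF000009687DC8           # x0 from the register dump
INSN = 0xB9401002                 # instruction in parentheses on the "Code:" line


def decode_esr(esr):
    """Split an ESR_ELx value into the fields the kernel prints (hypothetical helper)."""
    iss = esr & 0x1FFFFFF          # ISS: bits 24:0
    return {
        "ec": (esr >> 26) & 0x3F,  # exception class: bits 31:26
        "il": (esr >> 25) & 1,     # instruction length: bit 25 (1 = 32-bit)
        "iss": iss,
        "wnr": (iss >> 6) & 1,     # write-not-read: 0 means the access was a read
        "dfsc": iss & 0x3F,        # data fault status code: bits 5:0
    }


def decode_ldr_w_uimm(insn):
    """Decode 'LDR Wt, [Xn, #imm]' (32-bit load, unsigned scaled offset)."""
    if (insn >> 22) != 0b1011100101:
        raise ValueError("not a 32-bit LDR with unsigned offset")
    imm12 = (insn >> 10) & 0xFFF
    return {
        "rt": insn & 0x1F,          # destination register number
        "rn": (insn >> 5) & 0x1F,   # base register number
        "offset": imm12 * 4,        # imm12 is scaled by the 4-byte access size
    }


esr = decode_esr(ESR)
ldr = decode_ldr_w_uimm(INSN)

# EC 0x25 is a data abort taken from the current exception level;
# DFSC 0x07 is a level 3 translation fault.
print("EC=%#x IL=%d ISS=%#x WnR=%d DFSC=%#x"
      % (esr["ec"], esr["il"], esr["iss"], esr["wnr"], esr["dfsc"]))
print("ldr w%d, [x%d, #%d]" % (ldr["rt"], ldr["rn"], ldr["offset"]))
print("x0 + offset == fault address:", X0 + ldr["offset"] == FAULT_ADDR)
```

The decoded fields match what the kernel printed (DABT at the current EL, 32-bit instruction, read access), and the faulting load uses x0 as its base register, so the fault address is x0 plus the decoded offset.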

Expected results:
The system should keep running without an oops or hang.

Additional info:
With the dnf makecache service disabled, the oops no longer occurs.

I also talked to one of the linux-sunxi kernel developers. She said the kernel should not oops no matter how badly userspace is coded, so I am reporting it here.