Bug 204819
Description
Erhard F.
2019-09-11 22:47:36 UTC
Created attachment 284933 [details]
kernel .config (5.3-rc8, PowerMac G4 DP)
Could you please provide the content of /sys/kernel/debug/kernel_page_tables together with the dmesg, and also an 'objdump -h' of the failing modules?

Created attachment 284949 [details]
dmesg (5.3-rc8, PowerMac G4 DP)
Created attachment 284951 [details]
kernel_page_tables (5.3-rc8, PowerMac G4 DP)
Created attachment 284953 [details]
objdump sungem (5.3-rc8, PowerMac G4 DP)
Aaarrgghhh! The kernel_page_tables addresses are wrong. OK, I'll manage. For next time, please apply https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/arch/powerpc/mm/ptdump?h=next-20190904&id=7c7a532ba3fc51bf9527d191fb410786c1fdc73c

Created attachment 284961 [details]
Manually fixed kernel_page_tables
You get:

Sep 13 17:43:49 T600 kernel: BUG: Unable to handle kernel data access at 0xfe205150

However, this area is properly mapped RW:

0xfe205000-0xfe205fff 0x297d6000 4K rw present dirty accessed

There must be something wrong with the hash table update.

I can apply the patch, but how do I apply the manually fixed kernel_page_tables? Not quite sure how to proceed.

No no, what I mean by 'manually fixed kernel_page_tables' is that I fixed the file you attached so that it has the same values as it would have had with the patch applied. Don't worry.

I have carefully reviewed the flushing calls and have not been able to identify any issue.

Maybe an SMP issue? Does this problem also happen without CONFIG_SMP?

Created attachment 284991 [details]
dmesg v2 (5.3-rc8 + ptdump patch)
Created attachment 284993 [details]
kernel_page_tables v2 (5.3-rc8 + ptdump patch)
Created attachment 284995 [details]
objdump firewire-ohci v2 (5.3-rc8 + ptdump patch)
Created attachment 284997 [details]
objdump usbcore v2 (5.3-rc8 + ptdump patch)
Created attachment 284999 [details]
dmesg v3 (5.3-rc8 + ptdump patch, NO SMP)
Created attachment 285001 [details]
kernel_page_tables v3 (5.3-rc8 + ptdump patch, NO SMP)
Created attachment 285003 [details]
objdump usbcore v3 (5.3-rc8 + ptdump patch, NO SMP)
Created attachment 285005 [details]
objdump ohci-hcd v3 (5.3-rc8 + ptdump patch, NO SMP)
Created attachment 285007 [details]
objdump firewire-ohci v3 (5.3-rc8 + ptdump patch, NO SMP)
Created attachment 285009 [details]
objdump ehci-pci v3 (5.3-rc8 + ptdump patch, NO SMP)
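The lookup done by hand in this thread (matching a faulting address from dmesg against a kernel_page_tables range and reading its flags) could be sketched in Python roughly as follows. This is only an illustration: the line layout is assumed from the two entries quoted in the comments, and `parse_line`, `find_mapping` and `is_writable` are hypothetical helper names, not part of any kernel tool.

```python
import re
from typing import Iterable, NamedTuple, Optional, Tuple


class Mapping(NamedTuple):
    start: int               # first virtual address of the range
    end: int                 # last virtual address of the range (inclusive)
    phys: int                # physical address the range maps to
    size: str                # page size column, e.g. "4K"
    flags: Tuple[str, ...]   # remaining columns, e.g. ("rw", "present", ...)


# Assumed layout: "<start>-<end> <phys> <size> <flags...>"
LINE_RE = re.compile(
    r"^(0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(\S+)\s*(.*)$"
)


def parse_line(line: str) -> Optional[Mapping]:
    """Parse one page-table dump line; return None for headers or other text."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    start, end, phys, size, flags = m.groups()
    return Mapping(int(start, 16), int(end, 16), int(phys, 16), size,
                   tuple(flags.split()))


def find_mapping(lines: Iterable[str], addr: int) -> Optional[Mapping]:
    """Return the first range that covers addr, or None if nothing matches."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.start <= addr <= entry.end:
            return entry
    return None


def is_writable(entry: Mapping) -> bool:
    """True if the entry carries the 'rw' flag rather than plain 'r'."""
    return "rw" in entry.flags


# The two entries quoted in the comments of this bug.
SAMPLE = [
    "0xfe205000-0xfe205fff 0x297d6000 4K rw present dirty accessed",
    "0xfe295000-0xfe295fff 0x00d05000 4K user r present accessed",
]
```

With the real file, one would pass the lines of /sys/kernel/debug/kernel_page_tables instead of `SAMPLE` and the address from the "Unable to handle kernel data access" message as `addr`.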
(In reply to Christophe Leroy from comment #11)
> I have carefully reviewed the flushing calls and have not been able to
> identify any issue.
>
> Maybe a SMP issue ? Does this problem also happen without CONFIG_SMP ?

For my 1st (v2) retest I applied the "powerpc/ptdump: Fix addresses display on PPC32" patch you linked on top of -rc8. For my 2nd (v3) retest I applied the patch and disabled SMP in the .config.

As you can see, the issue still persists without SMP. Different 'problematic' modules also pop up even without SMP.

Thanks. The third one is interesting, as it shows:

Sep 16 12:57:39 T600 kernel: BUG: Unable to handle kernel data access at 0xfe295404

0xfe295000-0xfe295fff 0x00d05000 4K user r present accessed

So unlike the other cases, this one shows that the area is still read-only and pointing to the zero shadow area.

I wonder whether, in the other cases, the area was reallocated to a module loaded after the failing one. Are you able to dump kernel_page_tables just after each module load failure, before loading any additional module?

Do you confirm that the two patches done in the scope of bug #204479 are included in your config, especially the first one:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/arch/powerpc/mm/kasan/kasan_init_32.c?id=663c0c9496a69f80011205ba3194049bcafd681d
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/arch/powerpc/mm/kasan/kasan_init_32.c?id=45ff3c55958542c3b76075d59741297b8cb31cbb

(In reply to Christophe Leroy from comment #24)
> Do you confirm that the two patches done in the scope of bug #204479 are
> included in your config, especially the first one:
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/arch/powerpc/mm/kasan/kasan_init_32.c?id=663c0c9496a69f80011205ba3194049bcafd681d
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/arch/powerpc/mm/kasan/kasan_init_32.c?id=45ff3c55958542c3b76075d59741297b8cb31cbb

Ahh, now we are getting closer... No, I did the testing with a vanilla 5.3-rc8 kernel, assuming that

powerpc/kasan: Fix shadow area set up for modules
powerpc/kasan: Fix parallel loading of modules

were already in 5.3-rc8. Now I realize I was wrong. In -rc8 the latest one (2019-07-31) is:

powerpc/kasan: fix early boot failure on PPC32

The other two patches are dated 2019-08-20 and are in neither 5.3-rc8 nor the newly released 5.3.

Sorry for the noise! I will close the bug for now and retest as soon as the two patches are in 5.4-rcX or 5.3.x. If it still seems valid at that time, I will re-open.

If you want to test things that are not quite in mainline, you can always try the "merge" branch of the powerpc tree. That is a merge of master, next and fixes, so it should have any fixes that are in the pipeline.
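
The mismatch found in the last comments (fixes assumed to be in 5.3-rc8 but actually merged later) can be checked before testing. Below is a minimal Python sketch. It assumes a local clone of a kernel tree that contains both the commit and the tag, and that the linux-next commit IDs quoted above are unchanged in that tree. `ancestor_check_command` and `is_included` are hypothetical helper names. The sketch relies on `git merge-base --is-ancestor`, which exits with status 0 when the first commit is an ancestor of the second and 1 when it is not.

```python
import subprocess
from typing import List


def ancestor_check_command(commit: str, ref: str) -> List[str]:
    """Build the git command that asks whether commit is reachable from ref."""
    return ["git", "merge-base", "--is-ancestor", commit, ref]


def is_included(commit: str, ref: str, repo: str = ".") -> bool:
    """Return True if commit is an ancestor of ref in the clone at repo.

    Raises RuntimeError for any other git failure (for example an unknown
    commit or tag), so that an error is not mistaken for "not included".
    """
    result = subprocess.run(
        ancestor_check_command(commit, ref),
        cwd=repo,
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    raise RuntimeError(f"git failed ({result.returncode}): {result.stderr.strip()}")
```

A call such as `is_included("663c0c9496a69f80011205ba3194049bcafd681d", "v5.3-rc8", repo="/path/to/linux")` would then answer the question asked in comment #24 for the first patch; the path is a placeholder.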