Created attachment 303713 [details] perf data and kernel config

We have several Ice Lake servers with 1TB of memory installed. They all ran the 5.10.x LTS kernel without problems. After we upgraded to the 5.15.x LTS kernel, booting became extremely slow. After some analysis, we found that the slowdown was caused by hugepage allocation. Our systems use sysctl.conf to allocate hugepages at boot time, and running "echo 960 > nr_hugepages" by hand is just as slow. The only way to get a fast allocation is to use the boot command-line options "default_hugepagesz=1G hugepages=960".

Our system is a Xeon W3375 with 1TB of memory installed, but the bug also occurs on a Xeon 8180 with 1.5TB of memory. Our OS is Gentoo Linux. With 5.10.x the allocation speed is around 300GB/s; with 5.15.x it is only about 30GB/s. We also tried 6.1.1, which behaves the same as 5.15.x.

We compiled 5.10.163 and 5.15.88 with debug options and used "perf -a -g sleep 2" to capture the kernel functions involved. The two perf outputs are attached. I hope this helps.
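For anyone who wants to reproduce the GB/s figures, here is a minimal sketch of how the runtime allocation could be timed. It is not the reporter's method. It assumes that the write to nr_hugepages returns only once the kernel has finished allocating, and that the 1G pool lives at the sysfs path shown below (the report only says "nr_hugepages"). The helper names `throughput_gb_per_s` and `timed_request` are made up for illustration.

```python
import time
from pathlib import Path

# Assumed location of the 1 GiB hugepage pool; the report does not give a full path.
NR_HUGEPAGES = Path("/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages")


def throughput_gb_per_s(pages, page_size_gb, seconds):
    """Allocation speed in GB/s for `pages` pages of `page_size_gb` GB each."""
    return pages * page_size_gb / seconds


def timed_request(path, pages):
    """Write `pages` to `path`, then read back how many pages the file reports.

    Returns (granted, elapsed_seconds). On a real system the kernel may grant
    fewer pages than requested, so the read-back value is what should be
    used for the throughput calculation.
    """
    path = Path(path)
    start = time.perf_counter()
    path.write_text(f"{pages}\n")
    elapsed = time.perf_counter() - start
    granted = int(path.read_text().split()[0])
    return granted, elapsed


if __name__ == "__main__":
    # Needs root; 960 matches the page count used in the report.
    granted, elapsed = timed_request(NR_HUGEPAGES, 960)
    rate = throughput_gb_per_s(granted, 1, elapsed)
    print(f"{granted} pages in {elapsed:.2f}s ({rate:.1f} GB/s)")
```

Running the same script on 5.10.x and 5.15.x with an empty pool should give directly comparable numbers.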
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Sat, 11 Feb 2023 06:54:45 +0000 bugzilla-daemon@kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=217022
>
>             Bug ID: 217022
>            Summary: Extremely Slow Hugepage Allocation
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 5.15
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Page Allocator
>           Assignee: akpm@linux-foundation.org
>           Reporter: y.liu@naruida.com
>         Regression: No
>
> Created attachment 303713 [details]
>   --> https://bugzilla.kernel.org/attachment.cgi?id=303713&action=edit
> perf data and kernel config
>
> We have some ICE lake server with 1TB memory installed. They were all running
> at 5.10.x branch LTS kernel and run fine. After we upgraded kernel to 5.15.x
> LTS kernel, the booting process were extremely slow. After some analysis, we
> realized that was caused by hugepage allocation. Our system used sysctl.conf
> to
> allocate hugepage at the boot time. In fact, "echo 960 > nr_hugepages" had
> the
> same effect. The only way to do a fast allocation is to use boot cmd option:
> "default_hugepagesz=1G hugepages=960".
>
> Our System is Xeon W3375 with 1TB memory installed. But this bug also occured
> with Xeon 8180 with 1.5TB memory too. Our OS is Gentoo Linux. With 5.10.x,
> the
> allocation speed is around 300GB/s, and 5.15.x only had 30GB/s. We also tried
> 6.1.1, it is the same as 5.15.x .
>
> We compiled 5.10.163 and 5.15.88 with debug option and used "perf -a -g sleep
> 2" to catch kernel functions. Here are two perf outputs. I hope this can
> help.
>
> --
> You may reply to this email to add a comment.
>
> You are receiving this mail because:
> You are the assignee for the bug.
https://lore.kernel.org/all/20230414141429.pwgieuwluxwez3rj@techsingularity.net/