It appears that the upper limit on the number of CPUs used to calculate the scaling factor in get_update_sysctl_factor() in kernel/sched/fair.c is hard-coded as 8: unsigned int cpus = min_t(unsigned int, num_online_cpus(), 8); Is this intentional? Systems nowadays have far more CPUs. As per this article: https://thehftguy.com/2023/11/14/the-linux-kernel-has-been-accidentally-hardcoded-to-a-maximum-of-8-cores-for-nearly-20-years/?fbclid=IwAR1g-5xkqFhhCtW5bjNawQanctFmMFObKM-q9G2eMKS3pV8532Nso1KVtJg
Created attachment 305403 [details]: example fix. Tested on a 24-core Alder Lake i9-12900 with stress-ng. No performance issues with the schedmix, workload, fork and pthread stressors. I observed a 15.3% improvement in context switches for the fifo stressor.
So, I don't think it's correct to say that the limit was accidentally introduced [1], and people certainly have noticed before [2]. No comment on the merit of the change, but _assuming_ there is value in allowing this parameter to scale beyond 8 cores, I think the granularity of ilog2 might become unwieldy: 24 cores would get the same value as 16, 96 cores the same as 64, etc. [1] https://lore.kernel.org/lkml/1259253950.31676.249.camel@laptop/ [2] https://lore.kernel.org/all/CAKfTPtAKpMj15dHO1MC=dH_XJQe1Os24k93N2jDZ=kgg3O7K7A@mail.gmail.com/#t
Peter already gave a quick reply in this regard a couple of years ago [1], and I think more tests and numbers are required in different environments, instead of thinking solely about the raw number of cores. When does a high-throughput server start to degrade based on the task time slices? Maybe increasing from 8 to 24 is fine for normal memory-intensive tasks, but what about 128 cores in a high-demand network server? 8 has proven to be "enough" so far (compared to other OSes), but, of course, that doesn't mean there isn't room for improvement. [1] https://lore.kernel.org/all/20211102160402.GX174703@worktop.programming.kicks-ass.net/