Bug 10361
Summary: | Compiling with CONFIG_RT_GROUP_SCHED breaks pam limits.conf rtprio assignment | ||
---|---|---|---|
Product: | Process Management | Reporter: | Viktor Radnai (viktor.radnai) |
Component: | Scheduler | Assignee: | Ingo Molnar (mingo) |
Status: | RESOLVED OBSOLETE | ||
Severity: | normal | CC: | a.p.zijlstra, aicacaten, alan, birthdaystock, viktor.radnai |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.32 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Viktor Radnai
2008-03-30 03:06:27 UTC
This is _NOT_ a bug but expected behaviour, one you asked for by enabling RT group scheduling.

RT group scheduling means you have to assign a bandwidth to the group before it will accept tasks. By default all bandwidth is assigned to the root group; if you want to assign bandwidth to another group, reduce the root group's bandwidth and assign some or all of the difference to another group.

As RT scheduling is all about determinism, a group has to be able to rely on the amount of bandwidth being constant, hence the kernel cannot change this for you when the group configuration changes - you really have to do this yourself.

Hi,

OK, thanks for making that clear. I know that this is a new feature in 2.6.25 but I haven't found any explanation for the strange behaviour (that took me several days to debug) so I decided to report it.

I am still a bit concerned about breaking (or modifying, I should say) functionality with only a generic error (EPERM). Maybe documentation on group scheduling will help with this, but I would have appreciated some kind of clue (dmesg, whatever) on what the "Permission denied" meant in this case (even if you read the source, sched_setscheduler has five EPERM conditions). I can imagine that this would really stump someone who gets a kernel with this feature enabled by their distribution without knowing about it.

Can you please think of some way to protect the innocent (and the clueless), some way to make it more obvious to the user what's wrong?

Thanks in advance.

Regards,
Vik

On Sun, 2008-03-30 at 14:49 -0700, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=10361
>
> ------- Comment #2 from viktor.radnai@gmail.com 2008-03-30 14:49 -------
> Hi,
>
> OK, thanks for making that clear. I know that this is a new feature in 2.6.25
> but I haven't found any explanation for the strange behaviour (that took me
> several days to debug) so I decided to report it.

Documentation/sched-rt-group.txt

> I am still a bit concerned about breaking (or modifying, I should say)
> functionality with only a generic error (EPERM). Maybe documentation on group
> scheduling will help with this, but I would have appreciated some kind of clue
> (dmesg, whatever) on what the "Permission denied" meant in this case (even if
> you read the source, sched_setscheduler has five EPERM conditions). I can
> imagine that this would really stump someone who gets a kernel with this
> feature enabled by their distribution without knowing about it.
>
> Can you please think of some way to protect the innocent (and the clueless),
> some way to make it more obvious to the user what's wrong?

I see your point, however I'm not sure dmesg is the correct way; we'd set a precedent and eventually end up explaining every failing syscall.

What we need is a better error value; one that signifies failure due to lack of resources, something like -ENOSPC and -ENOMEM. Perhaps -EBUSY can be used to signify the lack of CPU resources?

> Documentation/sched-rt-group.txt

Yes, I read that before filing the bug and it still didn't make sense at the time. It makes more sense now, but it's still not clear how to enable it (it's a lack of background knowledge on my part). Anyway, this is just a matter of extending the documentation (which I would like to help with, after I have tried out this feature), and it doesn't belong in Bugzilla.

Oh, a very important realisation hit me just now. As a sysadmin looking after Java stuff, I'm mentally conditioned to understand "runtime" to be something totally different from what you mean (run + time, i.e. the time the process group has been given to run). Much of my confusion has been caused by this :)

> I see your point, however I'm not sure dmesg is the correct way; we'd
> set a precedent and eventually end up explaining every failing syscall.
>
> What we need is a better error value; one that signifies failure due to
> lack of resources, something like -ENOSPC and -ENOMEM. Perhaps -EBUSY
> can be used to signify the lack of CPU resources?

I agree with both. I actually considered other error codes myself, but couldn't find anything more appropriate in include/asm-generic/errno-base.h at the time that was better than your choice. I was hoping that those smarter than me would come up with a good way to notify userspace in a meaningful manner :)

Now also looking through include/asm-generic/errno.h, EDQUOT grabbed my attention, but that explicitly means "disk quota", not *any* quota. I suppose all of the above mean that some resource has run out, but none of them are applicable to this case. But I need to learn more about realtime process groups before I can suggest a more meaningful alternative.

Is there a chance of getting some new error codes added to include/asm-generic/errno.h? (I can see a bunch of subsystem-specific ones in there already)

Still present - did anyone decide whether to change it or close it?

Reply-To: peterz@infradead.org

On Tue, 2010-01-19 at 17:17 +0000, bugzilla-daemon@bugzilla.kernel.org
> --- Comment #5 from Alan <alan@lxorguk.ukuu.org.uk> 2010-01-19 17:17:22 ---
> Still present - did anyone decide whether to change it or close it?

I think it mostly depends on CONFIG_USER_SCHED, which places each user in a separate group, but since that code is deprecated and will hopefully soon go away, this issue should go away too.

Please re-open against a current kernel if this is still a problem.
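
For anyone who lands here after hitting the same EPERM, the bandwidth rule from the first reply in this thread can be sketched roughly as below. This is only an illustration under stated assumptions, not the kernel's code: the function names (`remaining_rt_runtime_us`, `group_accepts_rt_tasks`, `try_sched_fifo`) are made up for this note, the model assumes every group shares one period, and the numbers in the usage note are example values. The last helper uses Python's `os.sched_setscheduler` wrapper to show that the caller gets back only an errno, with nothing saying which EPERM condition was hit.

```python
import os


def remaining_rt_runtime_us(root_runtime_us, child_runtimes_us):
    """Runtime (microseconds per period) the root group can still hand out.

    Simplified, hypothetical model: all groups share one period, and the
    runtimes given to other groups may not add up to more than the root
    group's own runtime.
    """
    allocated = sum(child_runtimes_us)
    if allocated > root_runtime_us:
        raise ValueError("groups exceed the root group's RT runtime")
    return root_runtime_us - allocated


def group_accepts_rt_tasks(group_runtime_us):
    """A group with no RT bandwidth assigned will not accept RT tasks."""
    return group_runtime_us > 0


def try_sched_fifo(priority=1):
    """Request SCHED_FIFO for the calling process.

    Returns 0 on success, otherwise the errno of the failure. An EPERM
    here does not say which of the possible conditions was the cause.
    """
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return 0
    except OSError as exc:
        return exc.errno
```

With a root runtime of 950000 us as an example value, `remaining_rt_runtime_us(950000, [100000])` returns 850000, and `group_accepts_rt_tasks(0)` is False. A group left at zero runtime is the case where `try_sched_fifo()` would be expected to come back with `errno.EPERM` even though the rtprio limit looks correct.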