Hi, We are using Kernel 4.10.0-19-generic. In one of our test drivers, we assign a task to the global work queue, using schedule_delayed_work(). The issue we observe, is that when we call schedule_delayed_work(task, n) with delay of "n" jiffies, the task is getting scheduled after a delay of "n+1" and not "n" jiffies as expected. This worked fine till the kernel version 4.4.0-87-generic. The observation is on kernel 4.10.0-19-generic and above. A sample test code to highlight the difference between the kernel is given below: static struct tq_struct g_Task = { routine:interrupt_handler, data:NULL }; static void interrupt_handler( struct work_struct *irrelevant ) { printk("check 1 Jiffies: %llu \n", get_jiffies_64()); schedule_delayed_work( &g_Task, 1); printk("Check 2 Jiffies: %llu \n", get_jiffies_64()); } The dmesg output on 4.10 is as below (diff of 2 jiffies) [ 1047.975259] check 1 Jiffies: 4295153276 [ 1047.975263] Check 2 Jiffies: 4295153276 [ 1047.983255] check 1 Jiffies: 4295153278 [ 1047.983259] Check 2 Jiffies: 4295153278 [ 1047.991255] check 1 Jiffies: 4295153280 [ 1047.991260] Check 2 Jiffies: 4295153280 The dmesg output on 4.4 is as below (diff of 1 jiffies) [ 830.490657] check 1 Jiffies: 4295098955 [ 830.490664] Check 2 Jiffies: 4295098955 [ 830.494657] check 1 Jiffies: 4295098956 [ 830.494667] Check 2 Jiffies: 4295098956 [ 830.498656] check 1 Jiffies: 4295098957 [ 830.498663] Check 2 Jiffies: 4295098957 Thanks, Rajendra
It might be more useful if you can bisect down to the specific commit that caused the change in behavior. All of the timer/delay interfaces promise "not less then" behavior, so we may not return/fire early, but there's no guarantee to fire close to the requested time. Thus it is often the case where timers delay for N+1 the requested number of ticks, since we don't know where in-between the ticks we might be. Thus if half of a tick had passed, firing at the next jiffy would be returning after only half a jiffy, thus we must wait for 2 jiffies. So as described it doesn't seem to technically be broken, but it would be good to narrow down the change in behavior to make sure it was intentional.
Root Cause: Looks like granularity at each level adds up extra delay to the one requested. root@nskubuntu:~# uname -r 5.4.0-47-generic When we add a timer to expires after 1 tick, it expires after 2 ticks. The granularity at level 0 is 1 tick. Oct 13 22:26:57 nskubuntu kernel: [115889.955708] check Jiffies: 4323865246 Oct 13 22:26:57 nskubuntu kernel: [115889.963723] check Jiffies: 4323865248 Oct 13 22:26:57 nskubuntu kernel: [115889.971739] check Jiffies: 4323865250 Oct 13 22:26:57 nskubuntu kernel: [115889.979723] check Jiffies: 4323865252 Oct 13 22:26:57 nskubuntu kernel: [115889.987737] check Jiffies: 4323865254 Oct 13 22:26:57 nskubuntu kernel: [115889.995726] check Jiffies: 4323865256 Oct 13 22:26:57 nskubuntu kernel: [115890.003723] check Jiffies: 4323865258 Oct 13 22:26:57 nskubuntu kernel: [115890.011739] check Jiffies: 4323865260 Oct 13 22:26:57 nskubuntu kernel: [115890.019725] check Jiffies: 4323865262 Oct 13 22:26:57 nskubuntu kernel: [115890.027737] check Jiffies: 4323865264 When we add a timer to expires after 64 tick, it expires after 72 ticks. The granularity at level 1 is 8 ticks. Oct 13 22:28:14 nskubuntu kernel: [115967.247050] check Jiffies: 4323884569 Oct 13 22:28:14 nskubuntu kernel: [115967.535125] check Jiffies: 4323884641 Oct 13 22:28:15 nskubuntu kernel: [115967.823118] check Jiffies: 4323884713 Oct 13 22:28:15 nskubuntu kernel: [115968.107091] check Jiffies: 4323884784 Oct 13 22:28:15 nskubuntu kernel: [115968.395077] check Jiffies: 4323884856 Oct 13 22:28:15 nskubuntu kernel: [115968.683116] check Jiffies: 4323884928 Oct 13 22:28:16 nskubuntu kernel: [115968.971041] check Jiffies: 4323885000 Oct 13 22:28:16 nskubuntu kernel: [115969.263116] check Jiffies: 4323885073 Oct 13 22:28:16 nskubuntu kernel: [115969.547107] check Jiffies: 4323885144 Oct 13 22:28:17 nskubuntu kernel: [115969.835098] check Jiffies: 4323885216 When we add a timer to expires after 512 tick, it expires after 576 ticks. The granularity at level 2 is 64 ticks Oct 13 22:31:24 nskubuntu kernel: [116157.609664] check Jiffies: 4323932160 Oct 13 22:31:27 nskubuntu kernel: [116159.913630] check Jiffies: 4323932736 Oct 13 22:31:29 nskubuntu kernel: [116162.217594] check Jiffies: 4323933312 Oct 13 22:31:31 nskubuntu kernel: [116164.521617] check Jiffies: 4323933888 Oct 13 22:31:34 nskubuntu kernel: [116166.825527] check Jiffies: 4323934464 Oct 13 22:31:36 nskubuntu kernel: [116169.133540] check Jiffies: 4323935041 Oct 13 22:31:38 nskubuntu kernel: [116171.433534] check Jiffies: 4323935616 Oct 13 22:31:40 nskubuntu kernel: [116173.741493] check Jiffies: 4323936193 Changes: calc_index, calc_wheel_index, LVL_START may need changes. Still working on it to find the right changes. Test Code: int count = 1; int delay_jiffies = 1; //need to change this to 64 and then to 512 for each run struct timer_list my_timer; /* * TIMER FUNCTION * */ static void timer_function(struct timer_list *t){ unsigned long local_jiffies; local_jiffies = jiffies; printk("check Jiffies: %lu \n", local_jiffies); if(count < 10) { mod_timer(&my_timer, local_jiffies + delay_jiffies); count++; } } static int hello_init(void) { my_timer.expires = jiffies + 1 ; my_timer.function = timer_function; add_timer(&my_timer); return 0; } static void hello_exit(void) { del_timer(&my_timer); } module_init(hello_init); module_exit(hello_exit);
On Tue, Oct 13 2020 at 17:46, bugzilla-daemon wrote: > --- Comment #2 from Senthil Kumar Nagappan (sknagappan@yahoo.com) --- > Root Cause: > Looks like granularity at each level adds up extra delay to the one > requested. That's not a bug, that's a feature. > When we add a timer to expires after 1 tick, it expires after 2 ticks. The > granularity at level 0 is 1 tick. Correct, because the timer is guaranteed to sleep at least 1 jiffie. So if the timer is queued right at the edge of the next jiffie then it would expire immediately, which would violate the guarantee that it sleeps at least 1 jiffie. > When we add a timer to expires after 64 tick, it expires after 72 ticks. The > granularity at level 1 is 8 ticks. Correct, because the longer sleep times are batched to the granularity for power saving reasons. See the big comment on top of kernel/time/timers.c and the changelog of the commit which introduced the redesign for in depth explanation. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=500462a9de657f86edaa102f8ab6bff7f7e43fc2 The batching in the outer levels (farther out expiry time) has been there before this change already. The change made this batching an implicit property of the wheel storage while the previous variant had to do an explicit computation to achieve that. So again, that's not a bug. It's a feature and documented behaviour. Thanks, tglx