Created attachment 142451 [details]
Kernel config file used for 3.4.96

When I upgraded from kernel 3.4.95 to 3.4.96, module loading stopped working. A bisect gave the first bad commit as

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=d3007333170435135f0620fb52386aad3fb2a14f

As I could not guess what that change did, I checked the user space programs and discovered that a call to finit_module was not returning -1 as it should, but 350. I also discovered that if I strace the module loading program, all is well.

The above commit involves a comparison to NR_syscalls, so I did some tests to see how syscall behaves when the number is >= NR_syscalls, and there seems to be a problem (see below). If the syscall number is >= NR_syscalls, the call returns the number itself unless it is straced.

Tests were performed using the following code plus perl.

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <errno.h>
#include <string.h>

int main()
{
    long n = SYS_finit_module;
    int e = syscall(n);
    printf("syscall(%ld) = %d (%s)\n", n, e, strerror(errno));

    n = 999;
    e = syscall(n);
    printf("syscall(%ld) = %d (%s)\n", n, e, strerror(errno));
    return 0;
}

So on 3.4.96, NR_syscalls = 349

$ uname -a
Linux amd 3.4.96 #14 SMP Mon Jul 7 20:44:07 BST 2014 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux
$ ./a.out
syscall(350) = 350 (Success)
syscall(999) = 999 (Success)
$ strace -o /dev/null ./a.out
syscall(350) = -1 (Function not implemented)
syscall(999) = -1 (Function not implemented)
$ perl -e 'print syscall(349), " $!\n";'
349
$ strace -o /dev/null perl -e 'print syscall(349), " $!\n";'
-1 Function not implemented

On 3.4.95, NR_syscalls = 349

$ uname -a
Linux amd 3.4.95 #4 SMP Mon Jul 7 02:27:24 BST 2014 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux
$ ./a.out
syscall(350) = -1 (Function not implemented)
syscall(999) = -1 (Function not implemented)
$ strace -o /dev/null ./a.out
syscall(350) = -1 (Function not implemented)
syscall(999) = -1 (Function not implemented)
$ perl -e 'print syscall(349), " $!\n";'
-1 Function not implemented
$ strace -o /dev/null perl -e 'print syscall(349), " $!\n";'
-1 Function not implemented

I also went on to test 3.10.46 (NR_syscalls = 351) and 3.15.3 (NR_syscalls = 354)

$ uname -a
Linux amd 3.10.46 #4 SMP Mon Jul 7 16:34:59 BST 2014 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux
$ ./a.out
syscall(350) = -1 (Operation not permitted)
syscall(999) = 999 (Operation not permitted)
$ strace -o /dev/null ./a.out
syscall(350) = -1 (Operation not permitted)
syscall(999) = -1 (Function not implemented)
$ perl -e 'print syscall(351), " $!\n";'
351
$ strace -o /dev/null perl -e 'print syscall(351), " $!\n";'
-1 Function not implemented

$ uname -a
Linux amd 3.15.3 #3 SMP Mon Jul 7 16:02:29 BST 2014 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux
$ ./a.out
syscall(350) = -1 (Operation not permitted)
syscall(999) = 999 (Operation not permitted)
$ strace -o /dev/null ./a.out
syscall(350) = -1 (Operation not permitted)
syscall(999) = -1 (Function not implemented)
$ perl -e 'print syscall(354), " $!\n";'
354
$ strace -o /dev/null perl -e 'print syscall(354), " $!\n";'
-1 Function not implemented
Created attachment 143821 [details] cpuinfo
Created attachment 144061 [details]
sysenter_badsys.fix

I think I have found the problem that the addition of sysenter_badsys introduced in

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=d3007333170435135f0620fb52386aad3fb2a14f

The code flow when the syscall number is >= NR_syscalls is now as follows, reading top to bottom and assuming no jump to sysexit_audit.

sysenter_do_call:
	cmpl $(NR_syscalls), %eax
	jae sysenter_badsys

sysenter_badsys:
	movl $-ENOSYS,PT_EAX(%esp)
	jmp sysenter_after_call

sysenter_after_call:
	LOCKDEP_SYS_EXIT
	DISABLE_INTERRUPTS(CLBR_ANY)
	TRACE_IRQS_OFF
	movl TI_flags(%ebp), %ecx
	testl $_TIF_ALLWORK_MASK, %ecx
	jne sysexit_audit
sysenter_exit:
/* if something modifies registers it must also disable sysexit */
	movl PT_EIP(%esp), %edx
	movl PT_OLDESP(%esp), %ecx
	xorl %ebp,%ebp
	TRACE_IRQS_ON
1:	mov  PT_FS(%esp), %fs
	PTGS_TO_GS
	ENABLE_INTERRUPTS_SYSEXIT

eax contains the syscall number at the start and is never changed along this path. Thus the return value from the syscall is the original syscall number, and no error is detected in userspace.

Modifying sysenter_badsys to set eax to the error solves the problem, i.e. sysenter_badsys becomes

sysenter_badsys:
	movl $-ENOSYS,%eax
	movl %eax,PT_EAX(%esp)
	jmp sysenter_after_call

With the previous syscall_badsys path, and when stracing, the calls go through syscall_exit, I believe. I think that path must be setting eax from PT_EAX(%esp), perhaps in restore_all, before returning to userspace. I can't really follow what the code is doing (or is supposed to be doing) along that path.

Testing the above change (sysenter_badsys.fix attachment) as before, I get:

$ uname -a
Linux amd 3.4.96+ #26 SMP Tue Jul 22 17:32:57 BST 2014 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux
$ ./a.out
syscall(350) = -1 (Function not implemented)
syscall(999) = -1 (Function not implemented)
$ strace -o /dev/null ./a.out
syscall(350) = -1 (Function not implemented)
syscall(999) = -1 (Function not implemented)
$ perl -e 'print syscall(349), " $!\n";'
-1 Function not implemented
$ strace -o /dev/null perl -e 'print syscall(349), " $!\n";'
-1 Function not implemented
It seems that someone else spotted the problem and has it covered. See https://lkml.org/lkml/2014/7/20/222