Bug 79661 - syscalls with number >= NR_syscalls not returning an error
Summary: syscalls with number >= NR_syscalls not returning an error
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: i386
Hardware: i386 Linux
Importance: P1 normal
Assignee: platform_i386
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-08 13:35 UTC by Mark Davies
Modified: 2020-08-31 19:38 UTC
CC List: 6 users

See Also:
Kernel Version: 3.4.96, 3.10.46, 3.15.3
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Kernel config file used for 3.4.96 (66.81 KB, application/octet-stream)
2014-07-08 13:35 UTC, Mark Davies
cpuinfo (1.54 KB, application/octet-stream)
2014-07-22 01:56 UTC, Mark Davies
sysenter_badsys.fix (378 bytes, application/octet-stream)
2014-07-23 22:19 UTC, Mark Davies

Description Mark Davies 2014-07-08 13:35:13 UTC
Created attachment 142451
Kernel config file used for 3.4.96

When I upgraded from kernel 3.4.95 to 3.4.96 module loading stopped working. A bisect gave the first bad commit as
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=d3007333170435135f0620fb52386aad3fb2a14f

As I could not guess what that change did, I checked the user-space
programs and discovered that a call to finit_module was returning 350
rather than -1 as it should. I also discovered that if I strace the
module loading program, all is well. The above commit involves a
comparison to NR_syscalls, so I ran some tests to see how syscall
functioned when the number is >= NR_syscalls, and there seems to be a
problem (see below). If the syscall number is >= NR_syscalls, the call
returns that number unless straced.


Tests performed using the following code + perl.

#include <stdio.h>
#include <sys/syscall.h>
#include <errno.h>
#include <string.h>

int
main()
{
   long n = SYS_finit_module;
   int e = syscall(n);
   printf("syscall(%ld) = %d (%s)\n", n, e, strerror(errno));
   n = 999;
   e = syscall(n);
   printf("syscall(%ld) = %d (%s)\n", n, e, strerror(errno));
   return 0;
}
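
The program above never clears errno, so the strerror text it prints can be left over from an earlier call; that is the likely reason a line such as "syscall(999) = 999 (Operation not permitted)" can appear. Below is a minimal sketch of helpers that clear errno before each call and check for the symptom directly. The names probe_syscall and returned_own_number are hypothetical and are not part of the original test.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper: issue syscall nr with no arguments.
 * errno is cleared first so *err_out reflects this call only. */
static long probe_syscall(long nr, int *err_out)
{
    errno = 0;
    long ret = syscall(nr);
    *err_out = errno;
    return ret;
}

/* Hypothetical helper: true when a call "succeeded" by handing back
 * its own syscall number, which is the symptom reported in this bug. */
static int returned_own_number(long nr, long ret, int err)
{
    return nr > 0 && ret == nr && err == 0;
}
```

Calling probe_syscall(999, &err) and passing the result to returned_own_number should flag the symptom on an affected kernel. On an unaffected kernel the call is expected to return -1 with err set to ENOSYS.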


So on 3.4.96 NR_syscalls = 349

$ uname -a
Linux amd 3.4.96 #14 SMP Mon Jul 7 20:44:07 BST 2014 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux
$ ./a.out
syscall(350) = 350 (Success)
syscall(999) = 999 (Success)
$ strace -o /dev/null ./a.out
syscall(350) = -1 (Function not implemented)
syscall(999) = -1 (Function not implemented)
$ perl -e 'print syscall(349), " $!\n";'
349 
$ strace -o /dev/null perl -e 'print syscall(349), " $!\n";'
-1 Function not implemented



On 3.4.95 NR_syscalls = 349

$ uname -a
Linux amd 3.4.95 #4 SMP Mon Jul 7 02:27:24 BST 2014 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux
$ ./a.out
syscall(350) = -1 (Function not implemented)
syscall(999) = -1 (Function not implemented)
$ strace -o /dev/null ./a.out
syscall(350) = -1 (Function not implemented)
syscall(999) = -1 (Function not implemented)
$ perl -e 'print syscall(349), " $!\n";'
-1 Function not implemented
$ strace -o /dev/null perl -e 'print syscall(349), " $!\n";'
-1 Function not implemented



Also went on to test 3.10.46 (NR_syscalls = 351) and 3.15.3 (NR_syscalls = 354)

$ uname -a
Linux amd 3.10.46 #4 SMP Mon Jul 7 16:34:59 BST 2014 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux
$ ./a.out
syscall(350) = -1 (Operation not permitted)
syscall(999) = 999 (Operation not permitted)
$ strace -o /dev/null ./a.out
syscall(350) = -1 (Operation not permitted)
syscall(999) = -1 (Function not implemented)
$ perl -e 'print syscall(351), " $!\n";'
351 
$ strace -o /dev/null perl -e 'print syscall(351), " $!\n";'
-1 Function not implemented

$ uname -a
Linux amd 3.15.3 #3 SMP Mon Jul 7 16:02:29 BST 2014 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux
$ ./a.out
syscall(350) = -1 (Operation not permitted)
syscall(999) = 999 (Operation not permitted)
$ strace -o /dev/null ./a.out
syscall(350) = -1 (Operation not permitted)
syscall(999) = -1 (Function not implemented)
$ perl -e 'print syscall(354), " $!\n";'
354 
$ strace -o /dev/null perl -e 'print syscall(354), " $!\n";'
-1 Function not implemented
Comment 1 Mark Davies 2014-07-22 01:56:29 UTC
Created attachment 143821
cpuinfo
Comment 2 Mark Davies 2014-07-23 22:19:29 UTC
Created attachment 144061
sysenter_badsys.fix

I think I have found the problem that the addition of sysenter_badsys
introduced in https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=d3007333170435135f0620fb52386aad3fb2a14f

When the syscall number is >= NR_syscalls, the code flow is now as
follows (top to bottom, assuming no jump to sysexit_audit).

sysenter_do_call:
	cmpl $(NR_syscalls), %eax
	jae sysenter_badsys

sysenter_badsys:
	movl $-ENOSYS,PT_EAX(%esp)
	jmp sysenter_after_call

sysenter_after_call:
	LOCKDEP_SYS_EXIT
	DISABLE_INTERRUPTS(CLBR_ANY)
	TRACE_IRQS_OFF
	movl TI_flags(%ebp), %ecx
	testl $_TIF_ALLWORK_MASK, %ecx
	jne sysexit_audit
sysenter_exit:
/* if something modifies registers it must also disable sysexit */
	movl PT_EIP(%esp), %edx
	movl PT_OLDESP(%esp), %ecx
	xorl %ebp,%ebp
	TRACE_IRQS_ON
1:	mov  PT_FS(%esp), %fs
	PTGS_TO_GS
	ENABLE_INTERRUPTS_SYSEXIT


eax contains the syscall number at start and is never changed. Thus the
return value from the syscall is the original syscall number and no
error is detected in userspace. Modifying sysenter_badsys to set eax to
the error solves the problem, i.e. sysenter_badsys becomes

sysenter_badsys:
	movl $-ENOSYS,%eax
	movl %eax,PT_EAX(%esp)
	jmp sysenter_after_call

With the previous syscall_badsys path, and when stracing, I believe the
calls go through syscall_exit. I think that path must be setting eax
from PT_EAX(%esp), perhaps in restore_all, before returning to
userspace. I can't really follow what the code is doing (or is supposed
to be doing) along that path.
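
The difference between the two exit paths can be modeled in plain C. This is only a sketch of the reasoning above, not kernel code. struct model_regs, the function names, and MODEL_NR_SYSCALLS (349, the value reported for 3.4.96) are illustrative. The slow path reloading eax from the saved slot is an assumption taken from the guess in the preceding paragraph. The last helper mirrors the usual Linux libc convention that raw return values in the range -4095..-1 become -1 with errno set, which is why a leaked 350 looks like success to userspace.

```c
#include <errno.h>

#define MODEL_NR_SYSCALLS 349UL   /* value reported for 3.4.96 */

/* Only the out-of-range (bad syscall number) case is modeled. */
struct model_regs {
    long eax;      /* live register handed back by the fast sysexit path */
    long pt_eax;   /* saved slot, PT_EAX(%esp) */
};

/* Fast path with the original sysenter_badsys: only the saved slot is set. */
static long fast_exit_original(long sysno)
{
    struct model_regs r = { sysno, sysno };
    if ((unsigned long)r.eax >= MODEL_NR_SYSCALLS)
        r.pt_eax = -ENOSYS;            /* movl $-ENOSYS,PT_EAX(%esp) */
    return r.eax;                      /* eax is never reloaded */
}

/* Fast path with the proposed fix: eax is set first, then stored. */
static long fast_exit_fixed(long sysno)
{
    struct model_regs r = { sysno, sysno };
    if ((unsigned long)r.eax >= MODEL_NR_SYSCALLS) {
        r.eax = -ENOSYS;               /* movl $-ENOSYS,%eax */
        r.pt_eax = r.eax;              /* movl %eax,PT_EAX(%esp) */
    }
    return r.eax;
}

/* Slow path (assumed): eax is restored from the saved slot before return. */
static long slow_exit(long sysno)
{
    struct model_regs r = { sysno, sysno };
    if ((unsigned long)r.eax >= MODEL_NR_SYSCALLS)
        r.pt_eax = -ENOSYS;
    r.eax = r.pt_eax;
    return r.eax;
}

/* libc-style decoding of a raw return value: -4095..-1 means error. */
static long decode_raw(long raw, int *err_out)
{
    if ((unsigned long)raw >= (unsigned long)-4095L) {
        *err_out = (int)-raw;
        return -1;
    }
    *err_out = 0;
    return raw;
}
```

In this model, decode_raw(fast_exit_original(350), &err) yields 350 with no error, matching the unstraced output on 3.4.96. The fixed fast path and the slow path both decode to -1 with ENOSYS.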

I tested the above change (the sysenter_badsys.fix attachment) in the same way as before and got:
$ uname -a
Linux amd 3.4.96+ #26 SMP Tue Jul 22 17:32:57 BST 2014 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux
$ ./a.out
syscall(350) = -1 (Function not implemented)
syscall(999) = -1 (Function not implemented)
$ strace -o /dev/null ./a.out
syscall(350) = -1 (Function not implemented)
syscall(999) = -1 (Function not implemented)
$ perl -e 'print syscall(349), " $!\n";'
-1 Function not implemented
$ strace -o /dev/null perl -e 'print syscall(349), " $!\n";'
-1 Function not implemented
Comment 3 Mark Davies 2014-07-24 00:13:04 UTC
It seems that someone else spotted the problem and has it covered. See https://lkml.org/lkml/2014/7/20/222
