Bug 204119

Summary: scsi_mod: Could not allocate 4104 bytes percpu data
Product: IO/Storage
Reporter: Jan Palus (jpalus)
Component: SCSI
Assignee: linux-scsi (linux-scsi)
Status: NEW
Severity: normal
CC: bvanassche, justincase
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.2
Subsystem:
Regression: No
Bisected commit-id:
Attachments: kernel config 5.2

Description Jan Palus 2019-07-09 20:16:02 UTC
After upgrading to 5.2, scsi_mod fails to load (CONFIG_X86_VSMP is not set):

Jul 09 09:50:25 localhost kernel: percpu: allocation failed, size=4104 align=32 atomic=0, alloc from reserved chunk failed
Jul 09 09:50:25 localhost kernel: CPU: 0 PID: 372 Comm: systemd-udevd Tainted: G                T 5.2.0-0.1 #1
Jul 09 09:50:25 localhost kernel: Hardware name: System manufacturer System Product Name/Z170-PRO, BIOS 3801 03/14/2018
Jul 09 09:50:25 localhost kernel: Call Trace:
Jul 09 09:50:25 localhost kernel:  dump_stack+0x5c/0x78
Jul 09 09:50:25 localhost kernel:  pcpu_alloc.cold.12+0x22/0x45
Jul 09 09:50:25 localhost kernel:  ? __vmalloc_node_range+0x1cf/0x240
Jul 09 09:50:25 localhost kernel:  load_module+0xd8f/0x2500
Jul 09 09:50:25 localhost kernel:  ? map_vm_area+0x38/0x50
Jul 09 09:50:25 localhost kernel:  ? __vmalloc_node_range+0x1cf/0x240
Jul 09 09:50:25 localhost kernel:  ? __se_sys_init_module+0x136/0x160
Jul 09 09:50:25 localhost kernel:  __se_sys_init_module+0x136/0x160
Jul 09 09:50:25 localhost kernel:  do_syscall_64+0x5b/0x130
Jul 09 09:50:25 localhost kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 09 09:50:25 localhost kernel: RIP: 0033:0x7f3a5c452afe
Jul 09 09:50:25 localhost kernel: Code: 48 8b 0d 85 43 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 52 43 0c 00 f7 d8 64 89 01 48
Jul 09 09:50:25 localhost kernel: RSP: 002b:00007ffc9d2f6768 EFLAGS: 00000202 ORIG_RAX: 00000000000000af
Jul 09 09:50:25 localhost kernel: RAX: ffffffffffffffda RBX: 00000000008b5cf0 RCX: 00007f3a5c452afe
Jul 09 09:50:25 localhost kernel: RDX: 00007f3a5c7ae865 RSI: 000000000007f110 RDI: 00007f3a5a3c7010
Jul 09 09:50:25 localhost kernel: RBP: 00007f3a5c7ae865 R08: 000000000000005f R09: 00000000008b5ab0
Jul 09 09:50:25 localhost kernel: R10: 00000000008a0010 R11: 0000000000000202 R12: 00007f3a5a3c7010
Jul 09 09:50:25 localhost kernel: R13: 00000000008a4000 R14: 0000000000020000 R15: 00000000008b5cf0
Jul 09 09:50:25 localhost kernel: scsi_mod: Could not allocate 4104 bytes percpu data
Comment 1 Bart Van Assche 2019-07-09 21:18:30 UTC
From which kernel version did you upgrade to 5.2?
Comment 2 Jan Palus 2019-07-09 21:53:18 UTC
From 5.1.15, where it works reliably.

I've now noticed that the error seems somewhat random: in most cases the insert fails, but roughly once every 10 boots it succeeds. Note that the attempt to load the module is made very early in the initrd.
Comment 3 Jan Palus 2019-07-09 22:06:46 UTC
Not sure if that's relevant, but the reported percpu values differ between those versions:

5.1:

percpu: Embedded 46 pages/cpu s151552 r8192 d28672 u524288

5.2:

percpu: Embedded 54 pages/cpu s184320 r8192 d28672 u524288
Comment 4 Bart Van Assche 2019-07-09 22:23:36 UTC
Can you share your kernel config?
Comment 5 Jan Palus 2019-07-09 22:33:38 UTC
Created attachment 283597 [details]
kernel config 5.2
Comment 6 Bart Van Assche 2019-07-10 03:09:32 UTC
The "size=4104" in the error message probably refers to the SCSI log buffer. From drivers/scsi/scsi_logging.c:

#define SCSI_LOG_SPOOLSIZE 4096
struct scsi_log_buf {
	char buffer[SCSI_LOG_SPOOLSIZE];
	unsigned long map;
};
static DEFINE_PER_CPU(struct scsi_log_buf, scsi_format_log);

I am not aware of any changes between kernel versions v5.1 and v5.2 in the SCSI logging mechanism so I don't think that this indicates a regression in the SCSI subsystem. Anyway, does this patch help?

diff --git a/drivers/scsi/scsi_logging.c b/drivers/scsi/scsi_logging.c
index 39b8cc4574b4..148d8635d5f6 100644
--- a/drivers/scsi/scsi_logging.c
+++ b/drivers/scsi/scsi_logging.c
@@ -15,7 +15,7 @@
 #include <scsi/scsi_eh.h>
 #include <scsi/scsi_dbg.h>
 
-#define SCSI_LOG_SPOOLSIZE 4096
+#define SCSI_LOG_SPOOLSIZE SCSI_LOG_BUFSIZE
 
 #if (SCSI_LOG_SPOOLSIZE / SCSI_LOG_BUFSIZE) > BITS_PER_LONG
 #warning SCSI logging bitmask too large
Comment 7 Jan Palus 2019-07-10 08:57:07 UTC
The patch seems to fix the issue -- 5 successful boots in a row and no trace of failed percpu allocation. Thanks.
Comment 8 Bart Van Assche 2019-07-11 02:38:03 UTC
Thanks for testing! I will submit a more elaborate patch after the merge window has closed.
Comment 9 Yill Din 2019-08-10 11:55:13 UTC
When will the patch find its way into the 5.2 kernel?
I have the same bug: I can't use any USB device, nothing happens. I'm not able to build my own kernel, and I have tried every 5.2 kernel version on Arch Linux (5.2.1 - 5.2.7).
Comment 10 Bart Van Assche 2019-08-14 21:04:40 UTC
This patch has been accepted in Martin's tree as commit dccc96abfb21 ("scsi: core: Reduce memory required for SCSI logging") and is on its way to kernel v5.4. If you need that patch in kernel v5.2 soon and cannot compile the kernel yourself, then only your Linux distributor can help you.
Comment 11 Yill Din 2019-08-23 10:33:39 UTC
Thanks for the answer.
I tried patching my distribution kernel with your bugfix.

It works brilliantly. I can use all my USB devices again.