Bug 205201

Summary: Booting halts if Dawicontrol DC-2976 UW SCSI board installed, unless RAM size limited to 3500M
Product: Platform Specific/Hardware
Reporter: Roland (rj.ronkko)
Component: PPC-64
Assignee: platform_ppc-64
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: christophe.leroy, chzigotzky, michael
Priority: P1
Hardware: PPC-64
OS: Linux
Kernel Version: 5.3
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg fsl p5040
Kernel 5.4-rc6 config for the Cyrus+ board and for the QEMU ppce500 board (CPU: P5040 and P5020)
Patch for renaming the GFP_DMA32 to GFP_DMA
dmesg of Christoph's Git kernel

Description Roland 2019-10-15 14:36:12 UTC
Hardware environment:
AmigaOne X5000/20 (2 GHz), 8 GB RAM
Dawicontrol DC-2976 UW SCSI controller board (PCI) 

Linux distro: Ubuntu 16.04.6, Debian 8, Fienix (not relevant; the problem appears before loading starts).

Description: Booting halts immediately after loading the kernel if the Dawicontrol DC-2976 UW SCSI board is installed in the machine. If the RAM size is limited to 3500 MB (with the U-Boot variable 'mem=3500M'), booting continues normally.

Steps to reproduce: Turn on the machine.
Comment 1 Christian Zigotzky 2019-10-16 21:48:17 UTC
I have the same problem with my analog PCI TV card Typhoon TView RDS + FM Stereo (BT878 chip) in my AmigaOne X5000. It only works when the memory size is reduced to 3500 MB (mem=3500M).
I figured out that the issue is somewhere in the PowerPC updates 4.21-1 [1]. The cause may be the code change in the file "arch/powerpc/kernel/dma-swiotlb.c" [2] or the change in the file "arch/powerpc/kernel/dma.c" [3].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d6973327ee84c2f40dd9efd8928d4a1186c96e2

[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/arch/powerpc/kernel/dma-swiotlb.c?id=8d6973327ee84c2f40dd9efd8928d4a1186c96e2

[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/arch/powerpc/kernel/dma.c?id=8d6973327ee84c2f40dd9efd8928d4a1186c96e2
Comment 2 Christian Zigotzky 2019-10-29 23:45:54 UTC
Error message without limitation to 3.5G RAM:

[   25.654852] bttv 1000:04:05.0: overflow 0x00000000fe077000+4096 of DMA mask ffffffff bus mask df000000

The kernel configured the bttv card for DMA in the upper region of RAM but the device believes that it only supports 32-bit addressing.
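
The comparison behind this message can be modelled in user space. The sketch below is a simplified stand-in for the final check in the kernel's dma_capable() (as quoted in the patch later in this report), not the kernel code itself. The names model_dma_capable and min_not_zero_u64 are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Smaller of two masks, treating 0 as "no limit set". */
uint64_t min_not_zero_u64(uint64_t a, uint64_t b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return a < b ? a : b;
}

/*
 * Hypothetical user-space model: returns true if the last byte of the
 * buffer [addr, addr + size) lies at or below the tighter of the two masks.
 */
bool model_dma_capable(uint64_t addr, size_t size,
                       uint64_t dma_mask, uint64_t bus_dma_mask)
{
    assert(size > 0); /* an empty buffer has no last byte to check */
    return addr + size - 1 <= min_not_zero_u64(dma_mask, bus_dma_mask);
}
```

With the values from the log line (address 0xfe077000, size 4096, DMA mask 0xffffffff, bus mask 0xdf000000) the function returns false: the last byte, 0xfe077fff, is below the 32-bit device mask but above the bus mask 0xdf000000, which is the tighter limit.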
Comment 3 Christophe Leroy 2019-11-05 10:58:06 UTC
Can you bisect to identify the faulty commit?
Comment 4 Christian Zigotzky 2019-11-05 14:33:31 UTC
Yes, I can. Could you please post the correct commits for starting bisect?

git bisect start

git bisect good ?

git bisect bad ?

Thanks
Comment 5 Christophe Leroy 2019-11-05 14:47:08 UTC
I guess:

git bisect bad 8d6973327
git bisect good v4.20
Comment 6 Christian Zigotzky 2019-11-06 19:36:15 UTC
FYI, regarding the issue with some PCI cards (SCSI, TV cards, etc.):

Christoph Hellwig wrote:
Can you send me the .config and a dmesg?  And in the meantime try the patch below?

From 4d659b7311bd4141fdd3eeeb80fa2d7602ea01d4 Mon Sep 17 00:00:00 2001
From: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Date: Fri, 18 Oct 2019 13:00:43 +0200
Subject: dma-direct: check for overflows on 32 bit DMA addresses

As seen on the new Raspberry Pi 4 and sta2x11's DMA implementation it is
possible for a device configured with 32 bit DMA addresses and a partial
DMA mapping located at the end of the address space to overflow. It
happens when a higher physical address, not DMAable, is translated to
it's DMA counterpart.

For example the Raspberry Pi 4, configurable up to 4 GB of memory, has
an interconnect capable of addressing the lower 1 GB of physical memory
with a DMA offset of 0xc0000000. It transpires that, any attempt to
translate physical addresses higher than the first GB will result in an
overflow which dma_capable() can't detect as it only checks for
addresses bigger then the maximum allowed DMA address.

Fix this by verifying in dma_capable() if the DMA address range provided
is at any point lower than the minimum possible DMA address on the bus.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
---
include/linux/dma-direct.h | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index adf993a3bd58..6ad9e9ea7564 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -3,6 +3,7 @@
#define _LINUX_DMA_DIRECT_H 1

#include <linux/dma-mapping.h>
+#include <linux/memblock.h> /* for min_low_pfn */
#include <linux/mem_encrypt.h>

#ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
@@ -27,6 +28,13 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
   if (!dev->dma_mask)
       return false;

+#ifndef CONFIG_ARCH_DMA_ADDR_T_64BIT
+    /* Check if DMA address overflowed */
+    if (min(addr, addr + size - 1) <
+        __phys_to_dma(dev, (phys_addr_t)(min_low_pfn << PAGE_SHIFT)))
+        return false;
+#endif
+
   return addr + size - 1 <=
       min_not_zero(*dev->dma_mask, dev->bus_dma_mask);
}
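
The wraparound described in the commit message above can be reproduced with plain 32-bit arithmetic. This is only a user-space sketch under stated assumptions: the DMA offset 0xc0000000 is taken from the commit message, RAM is assumed to start at physical address 0, and all model_* names are hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for illustration only. */
#define MODEL_DMA_OFFSET 0xc0000000u /* offset quoted in the commit message */
#define MODEL_MIN_PHYS   0x00000000u /* lowest physical RAM address (assumed) */

/* 32-bit translation: the sum wraps modulo 2^32, like a 32-bit dma_addr_t. */
uint32_t model_phys_to_dma32(uint32_t phys)
{
    return phys + MODEL_DMA_OFFSET;
}

/* Old behaviour: only the upper limit is checked. */
bool model_capable_old(uint32_t addr, size_t size, uint32_t mask)
{
    return addr + (uint32_t)size - 1 <= mask;
}

/* Patched behaviour: also reject ranges below the lowest valid DMA address. */
bool model_capable_new(uint32_t addr, size_t size, uint32_t mask)
{
    uint32_t end = addr + (uint32_t)size - 1;
    uint32_t lowest = model_phys_to_dma32(MODEL_MIN_PHYS);
    uint32_t lo = addr < end ? addr : end;

    if (lo < lowest)
        return false; /* the translation must have wrapped around */
    return end <= mask;
}
```

In this model a physical address of 0x40000000 (just past the first GB) translates to DMA address 0 because the sum wraps. The old check accepts that address, while the patched check rejects it because 0 is below the lowest valid DMA address 0xc0000000.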
Comment 7 Christian Zigotzky 2019-11-07 08:46:34 UTC
Unfortunately this patch doesn't solve the issue. 

Error message:

    [    6.041163] bttv: driver version 0.9.19 loaded
    [    6.041167] bttv: using 8 buffers with 2080k (520 pages) each for capture
    [    6.041559] bttv: Bt8xx card found (0)
    [    6.041609] bttv: 0: Bt878 (rev 17) at 1000:04:05.0, irq: 19, latency: 128, mmio: 0xc20001000
    [    6.041622] bttv: 0: using: Typhoon TView RDS + FM Stereo / KNC1 TV Station RDS [card=53,insmod option]
    [    6.042216] bttv: 0: tuner type=5
    [    6.111994] bttv: 0: audio absent, no audio device found!
    [    6.176425] bttv: 0: Setting PLL: 28636363 => 35468950 (needs up to 100ms)
    [    6.200005] bttv: PLL set ok
    [    6.209351] bttv: 0: registered device video0
    [    6.211576] bttv: 0: registered device vbi0
    [    6.214897] bttv: 0: registered device radio0
    [  114.218806] bttv 1000:04:05.0: overflow 0x00000000ff507000+4096 of DMA mask ffffffff bus mask df000000
    [  114.218848] Modules linked in: rfcomm bnep tuner_simple tuner_types tea5767 tuner tda7432 tvaudio msp3400 bttv tea575x tveeprom videobuf_dma_sg videobuf_core rc_core videodev mc btusb btrtl btbcm btintel bluetooth uio_pdrv_genirq uio ecdh_generic ecc
    [  114.219012] [c0000001ecddf720] [80000000008ff6e8] .buffer_prepare+0x150/0x268 [bttv]
    [  114.219029] [c0000001ecddf860] [80000000008fff6c] .bttv_qbuf+0x50/0x64 [bttv]
Comment 8 Christian Zigotzky 2019-11-07 09:11:16 UTC
Trace:

[  462.783184] Call Trace:
[  462.783187] [c0000001c6c67420] [c0000000000b3358] .report_addr+0xb8/0xc0 (unreliable)
[  462.783192] [c0000001c6c67490] [c0000000000b351c] .dma_direct_map_page+0xf0/0x128
[  462.783195] [c0000001c6c67530] [c0000000000b35b0] .dma_direct_map_sg+0x5c/0xac
[  462.783205] [c0000001c6c675e0] [8000000000862e88] .__videobuf_iolock+0x660/0x6d8 [videobuf_dma_sg]
[  462.783220] [c0000001c6c676b0] [8000000000854274] .videobuf_iolock+0x98/0xb4 [videobuf_core]
[  462.783271] [c0000001c6c67720] [80000000008686e8] .buffer_prepare+0x150/0x268 [bttv]
[  462.783276] [c0000001c6c677c0] [8000000000854afc] .videobuf_qbuf+0x2b8/0x428 [videobuf_core]
[  462.783288] [c0000001c6c67860] [8000000000868f6c] .bttv_qbuf+0x50/0x64 [bttv]
[  462.783383] [c0000001c6c678e0] [80000000007bf208] .v4l_qbuf+0x54/0x60 [videodev]
[  462.783402] [c0000001c6c67970] [80000000007c1eac] .__video_do_ioctl+0x30c/0x3f8 [videodev]
[  462.783421] [c0000001c6c67a80] [80000000007c3c08] .video_usercopy+0x18c/0x3dc [videodev]
[  462.783440] [c0000001c6c67c00] [80000000007bb14c] .v4l2_ioctl+0x60/0x78 [videodev]
[  462.783460] [c0000001c6c67c90] [80000000007d3c48] .v4l2_compat_ioctl32+0x9b4/0x1850 [videodev]
[  462.783468] [c0000001c6c67d70] [c0000000001ad9cc] .__se_compat_sys_ioctl+0x284/0x127c
[  462.783473] [c0000001c6c67e20] [c00000000000067c] system_call+0x60/0x6c
[  462.783475] Instruction dump:
[  462.783477] 40fe0044 60000000 892255d0 2f890000 40fe0020 3c82ffc5 39200001 60000000 
[  462.783483] 38842029 992255d0 485ad0d9 60000000 <0fe00000> 38210070 e8010010 7c0803a6 
[  462.783490] ---[ end trace b677d4a00458e277 ]---
Comment 9 Christian Zigotzky 2019-11-07 09:13:38 UTC
Created attachment 285813
dmesg fsl p5040
Comment 10 Christian Zigotzky 2019-11-07 09:18:54 UTC
Created attachment 285815
Kernel 5.4-rc6 config for the Cyrus+ board and for the QEMU ppce500 board (CPU: P5040 and P5020)
Comment 11 Christian Zigotzky 2019-11-10 07:11:28 UTC
Christoph,

Do you have another patch for testing or shall I bisect?

Thanks,
Christian
Comment 12 Christian Zigotzky 2019-11-11 07:38:01 UTC
Hi Christoph,

I noticed that the kernel config option CONFIG_ARCH_DMA_ADDR_T_64BIT is enabled in my config. That means the check added by your patch is compiled out and has no effect when this option is enabled.

+#ifndef CONFIG_ARCH_DMA_ADDR_T_64BIT
+    /* Check if DMA address overflowed */
+    if (min(addr, addr + size - 1) <
+        __phys_to_dma(dev, (phys_addr_t)(min_low_pfn << PAGE_SHIFT)))
+        return false;
+#endif

I will delete the #ifndef and #endif lines and try again.

Cheers,
Christian
Comment 13 Christian Zigotzky 2019-11-11 12:19:35 UTC
Christoph,

Now, I can definitely say that this patch does not solve the issue.

Do you have another patch for testing or shall I bisect?

Thanks,
Christian
Comment 14 Christian Zigotzky 2019-11-13 11:01:10 UTC
Created attachment 285889
Patch for renaming the GFP_DMA32 to GFP_DMA

Hi All,

The issue with the BT878 TV cards is solved. :-)

GFP_DMA32 was renamed to GFP_DMA in the PowerPC updates 4.21-1 in December last year.

Some PCI devices still use GFP_DMA32 (grep -r GFP_DMA32 *). I renamed GFP_DMA32 to GFP_DMA in the file "drivers/media/v4l2-core/videobuf-dma-sg.c". After compiling RC7 of kernel 5.4, my BT878 TV card works again.

I created a patch for renaming GFP_DMA32 to GFP_DMA.

This patch doesn't solve the issue with the Dawicontrol DC-2976 UW SCSI board. Please help us to solve the issue with the SCSI board too.

Cheers,
Christian
Comment 15 Christian Zigotzky 2019-11-16 06:48:21 UTC
FYI: Source files of the Dawicontrol DC-2976 UW SCSI board (PCI): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/sym53c8xx_2?h=v5.4-rc7

/*
 *  DMA addressing mode.
 *
 *  0 : 32 bit addressing for all chips.
 *  1 : 40 bit addressing when supported by chip.
 *  2 : 64 bit addressing when supported by chip,
 *      limited to 16 segments of 4 GB -> 64 GB max.
 */
#define   SYM_CONF_DMA_ADDRESSING_MODE CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE

Cyrus config:

CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1

I will configure “0 : 32 bit addressing for all chips” for the RC8. Maybe this is the solution.
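
To make the three modes concrete, here is a rough user-space sketch of the address width each one would request, based only on the comment quoted above. The helper names are hypothetical, model_dma_bit_mask merely mirrors the idea of the kernel's DMA_BIT_MASK(n), and the 16-segment (64 GB) limit of mode 2 is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low n bits set (same idea as the kernel's DMA_BIT_MASK). */
uint64_t model_dma_bit_mask(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    /* Shifting a 64-bit value by 64 is undefined in C, so handle it apart. */
    return n == 64 ? ~0ULL : (1ULL << n) - 1;
}

/* Hypothetical helper: address width requested by each addressing mode. */
unsigned int model_mode_to_bits(int mode)
{
    switch (mode) {
    case 1:
        return 40; /* 40 bit addressing when supported by chip */
    case 2:
        return 64; /* 64 bit addressing when supported by chip */
    default:
        return 32; /* mode 0: 32 bit addressing for all chips */
    }
}
```

In this model, mode 0 yields the mask 0xffffffff, so the chip would only be handed addresses below 4 GB, while mode 1 yields 0xffffffffff. Whether switching to mode 0 avoids the boot hang on this board is not confirmed by this sketch.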
Comment 16 Christian Zigotzky 2019-11-23 11:28:00 UTC
Created attachment 286031
dmesg of Christoph's Git kernel

Christoph Hellwig wrote:

I think we have two sorta overlapping issues here.  One is that I think
we need the bus_dma_limit, which should mostly help for something like
a SCSI controller that doesn't need streaming mappings (btw, do we
have more details on that somewhere?).

And something weird with the videobuf things.  Your change of the dma
masks suggests that the driver doesn't do the right allocations and thus
hits bounce buffering (swiotlb).  We should fix that for real, but the
fact that the bounce buffering itself also fails is even more interesting.

Can you try this git branch:

    git://git.infradead.org/users/hch/misc.git fsl-dma-debugging

Gitweb:

    http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/fsl-dma-debugging

and send me the dmesg with that with your TV adapter?

-----------------------------

Hello Christoph,

Here is the dmesg of your Git kernel.

Thanks,
Christian
Comment 17 Roland 2020-01-10 10:15:11 UTC
Kernel 5.5 alpha 1 fixed the issue with the Dawicontrol DC-2976 UW SCSI board. An RTL 8169 Ethernet card, which had a similar problem with earlier kernels, now also works with the full 8 GB of RAM.