Bug 110931

Summary: parport cannot get buffer for DMA
Product: Drivers Reporter: Mark (mark_k)
Component: Parallel Assignee: drivers_parallel
Status: NEW ---    
Severity: normal CC: bp, joro, mark_k, sudipm.mukherjee, szg00000
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.4.0-040400-generic Subsystem:
Regression: No Bisected commit-id:
Attachments: patch
lspci -vvv
lspci -n -vvv

Description Mark 2016-01-17 15:04:26 UTC
I'm using Lubuntu x86-64 with mainline kernels from http://kernel.ubuntu.com/~kernel-ppa/mainline/

Dell Latitude D830 laptop connected to a PD01X docking station which provides amongst other things a native parallel port.

With kernel 4.3.3, I get this in dmesg output:
[    1.940373] parport_pc 00:06: reported by Plug and Play ACPI
[    1.940526] parport0: PC-style at 0x378 (0x778), irq 7, dma 1 [PCSPP,TRISTATE,COMPAT,EPP,ECP,DMA]
[    2.028364] lp0: using parport0 (interrupt-driven).

However with 4.4.0 there is a problem:

[    1.910959] parport_pc 00:06: reported by Plug and Play ACPI
[    1.911139] parport0: PC-style at 0x378 (0x778), irq 7, dma 1 [PCSPP,TRISTATE,COMPAT,EPP,ECP,DMA]
[    1.912394] hwdev DMA mask = 0x0000000000ffffff, dev_addr = 0x00000000dbe5b000
[    1.912476] swiotlb: coherent allocation failed for device 00:06 size=4096
[    1.912552] CPU: 1 PID: 222 Comm: systemd-modules Not tainted 4.4.0-040400-generic #201601101930
[    1.912645] Hardware name: Dell Inc. Latitude D830                   /0HN338, BIOS A17 06/19/2013
[    1.912745]  0000000000000000 00000000273b115d ffff8800db0d79a8 ffffffff813c8d94
[    1.913120]  0000000000ffffff ffff8800db0d79e8 ffffffff813f1e91 00000000dbe5b000
[    1.913489]  00000000024002c1 ffff880000039800 0000000000001000 ffff8800362f6de0
[    1.913854] Call Trace:
[    1.913942]  [<ffffffff813c8d94>] dump_stack+0x44/0x60
[    1.914029]  [<ffffffff813f1e91>] swiotlb_alloc_coherent+0x141/0x150
[    1.914117]  [<ffffffff81062503>] x86_swiotlb_alloc_coherent+0x43/0x50
[    1.914206]  [<ffffffffc00d9745>] parport_pc_probe_port+0x9b5/0x1190 [parport_pc]
[    1.914308]  [<ffffffff8152498c>] ? _dev_info+0x6c/0x90
[    1.914396]  [<ffffffffc00da2e3>] parport_pc_pnp_probe+0x143/0x1e0 [parport_pc]
[    1.914496]  [<ffffffffc00da1a0>] ? parport_pc_pci_probe+0x280/0x280 [parport_pc]
[    1.914595]  [<ffffffff81497911>] pnp_device_probe+0x61/0xc0
[    1.914680]  [<ffffffff81528a82>] driver_probe_device+0x222/0x4a0
[    1.914768]  [<ffffffff81528d84>] __driver_attach+0x84/0x90
[    1.914857]  [<ffffffff81528d00>] ? driver_probe_device+0x4a0/0x4a0
[    1.914945]  [<ffffffff815266ac>] bus_for_each_dev+0x6c/0xc0
[    1.915031]  [<ffffffff8152823e>] driver_attach+0x1e/0x20
[    1.915117]  [<ffffffff81527d7b>] bus_add_driver+0x1eb/0x280
[    1.915201]  [<ffffffff81529620>] driver_register+0x60/0xe0
[    1.915287]  [<ffffffff81497750>] pnp_register_driver+0x20/0x30
[    1.915374]  [<ffffffffc006c38e>] parport_pc_init+0x2b1/0xf23 [parport_pc]
[    1.915462]  [<ffffffffc006c0dd>] ? parport_parse_param.constprop.18+0xdd/0xdd [parport_pc]
[    1.915561]  [<ffffffff81002123>] do_one_initcall+0xb3/0x200
[    1.915646]  [<ffffffff811c91f1>] ? __vunmap+0x91/0xe0
[    1.915733]  [<ffffffff811e52eb>] ? kmem_cache_alloc_trace+0x16b/0x1d0
[    1.915821]  [<ffffffff811e6055>] ? kfree+0x115/0x130
[    1.915908]  [<ffffffff81186d13>] do_init_module+0x5f/0x1e5
[    1.915995]  [<ffffffff8110565b>] load_module+0x160b/0x1b80
[    1.916082]  [<ffffffff811018b0>] ? __symbol_put+0x60/0x60
[    1.916154]  [<ffffffff8120bee0>] ? kernel_read+0x50/0x80
[    1.916237]  [<ffffffff81105e19>] SyS_finit_module+0xb9/0xf0
[    1.916317]  [<ffffffff817fd9b6>] entry_SYSCALL_64_fastpath+0x16/0x75
[    1.916409] parport0: cannot get buffer for DMA, resorting to PIO operation
[    2.004194] lp0: using parport0 (interrupt-driven).
Comment 1 Mark 2016-01-17 21:56:44 UTC
Based on installing various 4.4-rcX kernels, it seems this problem appeared at some point between 4.4-rc7 and 4.4-rc8.
Comment 2 Sudip 2016-01-19 08:13:28 UTC
Created attachment 200441 [details]
patch

There was no change in parport between v4.3 and v4.4. But there were some other changes; you can find them at https://lkml.org/lkml/2015/6/8/76

I do not know anything about swiotlb.c, but just for testing, can you please try the attached patch and see if the problem is solved? It will apply on v4.4.0.

regards
sudip
Comment 3 Mark 2016-01-19 20:11:30 UTC
I applied your patch but it didn't fix the problem.

I guess "Parallel" isn't actually the most relevant bugzilla component for this issue, but I'm not sure what is.

Maybe someone could add Joerg Roedel to the cc list since his changes were given in the lkml post you linked to?
Comment 4 Borislav Petkov 2016-01-20 08:49:29 UTC
A couple of points only:

If you mean this patch:

186dfc9d69b9 ("x86/swiotlb: Try coherent allocations with __GFP_NOWARN")

it went in in 4.2 and so it was already in 4.3. So I'm sceptical it is the
culprit.

Looking at the code, we're in here:

        /* Confirm address can be DMA'd by device */
        if (dev_addr + size - 1 > dma_mask) {
                printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
                       (unsigned long long)dma_mask,
                       (unsigned long long)dev_addr);

                /* DMA_TO_DEVICE to avoid memcpy in unmap_single */
                swiotlb_tbl_unmap_single(hwdev, paddr,
                                         size, DMA_TO_DEVICE);
                goto err_warn;
        }

and dev_addr overshot the DMA mask, so we can't DMA to that device. We
got that address from map_single() ... I'll let Joerg decipher this
further.
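
For illustration only, the quoted check can be replayed in user space with the
values from the dmesg output in comment #0. This is a sketch under stated
assumptions: exceeds_dma_mask() is a hypothetical helper name, not a kernel
function; the real check is the one quoted above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space helper mirroring the quoted kernel check.
 * Returns 1 if the last byte of the buffer lies above the DMA mask,
 * i.e. the device cannot reach the whole buffer.
 */
int exceeds_dma_mask(uint64_t dev_addr, uint64_t size, uint64_t dma_mask)
{
        assert(size > 0);       /* size - 1 would wrap for an empty buffer */
        return dev_addr + size - 1 > dma_mask;
}
```

With the reported values (dev_addr 0xdbe5b000, size 4096, mask 0xffffff) the
helper returns 1, which matches the failure path taken in the log.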

The only suggestion I would have is to bisect this, if you can reproduce
it reliably and if you're sure it started appearing between 4.3 and 4.4.

Mark, feel free to ask if anything is not clear on how to bisect.

HTH.
Comment 5 Joerg Roedel 2016-01-20 10:54:29 UTC
(In reply to Mark from comment #0)

> [    1.912394] hwdev DMA mask = 0x0000000000ffffff, dev_addr =
> 0x00000000dbe5b000

The dma-mask indicates that the device can only access the first 16MB of memory (24bit DMA mask). But the swiotlb aperture is somewhere above this address, so the code has no chance to allocate something that fits into the mask.
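
As a rough sketch of the arithmetic: a 24 bit mask covers addresses 0 through 0xffffff, which is exactly 16MB. The function below is a hypothetical user-space stand-in, modeled on the kernel's DMA_BIT_MASK() macro; the name dma_bit_mask() is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space equivalent of a "low n bits set" DMA mask.
 * n == 64 is handled separately because shifting a 64-bit value by 64
 * is undefined behaviour in C.
 */
uint64_t dma_bit_mask(unsigned int bits)
{
        assert(bits >= 1 && bits <= 64);
        return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}
```

dma_bit_mask(24) yields 0xffffff, so the highest reachable byte is at 16MB - 1, while the dev_addr in the log (0xdbe5b000) is far above that.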

My guess is that the mask is wrong and a patch changed it from a 32 bit into a 24 bit mask, breaking parport for you. Bisecting will show which patch it was. Can you please also upload the output of 'lspci -vvv' and 'lspci -n -vvv'?

Thanks, Joerg
Comment 6 Sudip 2016-01-20 11:14:22 UTC
(In reply to Joerg Roedel from comment #5)
> (In reply to Mark from comment #0)
> 
> > [    1.912394] hwdev DMA mask = 0x0000000000ffffff, dev_addr =
> > 0x00000000dbe5b000
> 
> The dma-mask indicates that the device can only access the first 16MB of
> memory (24bit DMA mask). But the swiotlb aperture is somewhere above this
> address, so the code has no chance to allocate something that fits into the
> mask.
> 
> My guess is that the mask is wrong and a patch changed it from a 32 bit
> into a 24 bit mask.

The mask has always been 24 bit. Introduced with dfa7c4d869b7 ("parport_pc: set properly the dma_mask for parport_pc device")

regards
sudip
Comment 7 Joerg Roedel 2016-01-20 11:46:12 UTC
(In reply to Sudip from comment #6)
> The mask has always been 24 bit. Introduced with dfa7c4d869b7 ("parport_pc:
> set properly the dma_mask for parport_pc device")

Okay, then we need the bisect result to find out what broke it.
Comment 8 Mark 2016-01-20 14:22:55 UTC
I wonder... could it be that a change between 4.4-rc7 and 4.4-rc8 introduced the "coherent allocation failed..." diagnostic message, but that the allocation just failed silently on 4.4-rc7 and earlier kernels???

[However, wouldn't the message "parport0: cannot get buffer for DMA, resorting to PIO operation" have been printed on 4.4-rc7 & earlier too, if the allocation did fail there?]

Anyway, I can try bisecting. Any suggestions as to the earliest kernel to cover in the bisection?
Comment 9 Mark 2016-01-20 14:26:01 UTC
Created attachment 200571 [details]
lspci -vvv
Comment 10 Mark 2016-01-20 14:26:25 UTC
Created attachment 200581 [details]
lspci -n -vvv
Comment 11 Borislav Petkov 2016-01-20 15:47:54 UTC
(In reply to Mark from comment #8)
> [However, wouldn't the message "parport0: cannot get buffer for DMA,
> resorting to PIO operation" have been printed on 4.4-rc7 & earlier too, if
> the allocation did fail there?]

Looks like it, parport_pc_probe_port() which issues that warning hasn't
been changed around that timeframe. So you either get the DMA buffer and
no message at all or no DMA buffer but message gets printed.

> Anyway, I can try bisecting. Any suggestions as to the earliest kernel to
> cover in the bisection?

What I'd do is try 4.3 and if the message doesn't appear, try all the
-rc's since then: 4.4-rc1, 4.4-rc2, ... and see where it appears.

If it really appears between -rc7 and -rc8, then the bisection would be
a lot less work.

Thanks.
Comment 12 Mark 2016-01-21 22:17:15 UTC
I think I have found the cause of this problem. It's probably down to a change in the Ubuntu kernel configs, not a kernel bug after all.

$ diff config-4.4.0-040400rc7-generic config-4.4.0-040400rc8-generic 
...
393c393
< CONFIG_ZONE_DMA=y
---
> # CONFIG_ZONE_DMA is not set
517,518c517
< CONFIG_ZONE_DMA_FLAG=1
< CONFIG_BOUNCE=y
---
> CONFIG_ZONE_DMA_FLAG=0
545a545
> CONFIG_ZONE_DEVICE=y
...

Googling CONFIG_ZONE_DMA gave: "DMA memory allocation support allows devices with less than 32-bit addressing to allocate within the first 16MB of address space. Disable if no such devices will be used."
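
A minimal sketch of how such a config line could be checked programmatically, assuming the usual kernel config format seen in the diff above ("CONFIG_X=y" when enabled, "# CONFIG_X is not set" when disabled). config_line_enables() is a hypothetical helper written for this example, not part of any kernel tooling.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `line` is exactly "<opt>=y"
 * (optionally followed by a newline, as fgets() would return it),
 * and 0 otherwise, including for "# <opt> is not set".
 */
int config_line_enables(const char *line, const char *opt)
{
        size_t n;

        assert(line != NULL && opt != NULL);
        n = strlen(opt);
        return strncmp(line, opt, n) == 0 &&
               line[n] == '=' && line[n + 1] == 'y' &&
               (line[n + 2] == '\0' || line[n + 2] == '\n');
}
```

Applied to the two configs, the rc7 line "CONFIG_ZONE_DMA=y" is reported as enabled and the rc8 line "# CONFIG_ZONE_DMA is not set" is not. The exact match on the option name also keeps "CONFIG_ZONE_DMA_FLAG=1" from being mistaken for CONFIG_ZONE_DMA.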

So I'll file a bug with Ubuntu, and this bug should be marked as invalid.
Comment 13 Mark 2016-01-21 22:25:27 UTC
Launchpad.net bug:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1536813
Comment 14 Borislav Petkov 2016-01-21 22:28:47 UTC
So how did that config-4.4.0-040400rc8-generic get created even? CONFIG_ZONE_DMA is enabled by default so why is it off in that rc8 config?
Comment 15 Joerg Roedel 2016-01-21 22:48:01 UTC
(In reply to Mark from comment #12)
> < CONFIG_ZONE_DMA=y
> ---
> > # CONFIG_ZONE_DMA is not set

That makes sense as the root cause. ZONE_DMA on x86 includes the first 16MB of physical memory. It is used to allocate memory which is guaranteed below 16MB, and if it is disabled your allocated memory could be above 16MB and trigger the warning in the parport code.
Comment 16 Sudip 2016-01-22 04:06:30 UTC
This looks similar to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1534647
Comment 17 Borislav Petkov 2016-01-22 12:34:24 UTC
Aha, there it is: 033fbae988fc ("mm: ZONE_DEVICE for "device memory"")
which is pmem. And I can imagine why Ubuntu enabled it, and I can also
imagine other distros wanting to enable support for pmem too.

We probably should have a discussion upstream whether enabling
CONFIG_ZONE_DEVICE and breaking devices which need ZONE_DMA allocations
is fine.

Sudip, if you CC everyone from the bug and the commit above and lkml, it
would probably be a good start for figuring this out.

Thanks.
Comment 18 Mark 2016-01-22 14:48:08 UTC
Parallel ports are probably the most common affected device. 2016-model Dell laptops (still) have a dock port so can use native parallel & serial ports. Maybe the situation with HP & Lenovo machines is similar? And there are (probably) current desktop motherboards with a parallel port.

Apart from that though, perhaps no ZONE_DMA means ISA sound & SCSI cards break. [There are a few current-model motherboards with an ISA slot, though I'm guessing most are used for legacy applications.]

I guess PCMCIA devices (including passive CompactFlash-to-PCMCIA adapters) could be impacted too?