Bug 10203 - 2.6.25 IOMMU breaks DMA for b43 on x86_64
Summary: 2.6.25 IOMMU breaks DMA for b43 on x86_64
Status: REJECTED INSUFFICIENT_DATA
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-03-09 00:55 UTC by Christian Casteyde
Modified: 2009-03-23 10:11 UTC (History)
5 users (show)

See Also:
Kernel Version: 2.6.25-rc4
Subsystem:
Regression: ---
Bisected commit-id:


Attachments

Description Christian Casteyde 2008-03-09 00:55:19 UTC
Latest working kernel version: 2.6.24
Earliest failing kernel version: 2.6.24-rc4 (earliest known)
Distribution: Bluewhite64
Hardware Environment: Acer 1511LMi laptop, broadcomm wireless chipset
Software Environment: ifconfig, iwconfig
Problem Description:
With 2.6.25-rc4, the b43 driver on this laptop fails.
In order to scan / associate WPA network, the network interface must be up. So there is a "ifconfig eth1 up" somewhere in the start scripts, and this command simply fails with:

root@athor:/#ifconfig eth1 up
SIOCSIFFLAGS: Cannot allocate memory
root@athor:/#

That's all. There is strictly no error log in dmesg.
I didn't checked the wired interfaces, since I'm using only wireless. It may not be b43 specific therefore.

Steps to reproduce:
I guess, simply issue:
ifconfig eth1 up
on an laptop at least with a b43 driver.
Comment 1 Christian Casteyde 2008-03-09 01:49:41 UTC
It is indeed specific to b43.
I managed to configure the integrated, non wireless, network card.
Comment 2 Michael Buesch 2008-03-09 01:59:58 UTC
Please sprinkle some printks in the code to find the place where it throws -ENOMEM.
I really can't do anything with this bugreport.
Comment 3 Christian Casteyde 2008-03-09 03:30:17 UTC
Which kernel function should I start with? Where ifconfig up can end up in b43?
I've no clue of the kernel source structure, I usually use grep...
Comment 4 Michael Buesch 2008-03-09 03:45:01 UTC
b43_op_start()
Comment 5 Christian Casteyde 2008-03-09 07:13:58 UTC
OK, dma_init.
I've added several printk to b43, and I found this at boot:

b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 351.126 (2006-07-29 05:54:02)
b43-phy0 warning: You are using an old firmware image. Support for old firmware will be removed in July 2008.
b43-phy0 warning: You must go to http://linuxwireless.org/en/users/Drivers/b43#devicefirmware and download the latest firmware (version 4).
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12
b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 351.126 (2006-07-29 05:54:02)
b43-phy0 warning: You are using an old firmware image. Support for old firmware will be removed in July 2008.
b43-phy0 warning: You must go to http://linuxwireless.org/en/users/Drivers/b43#devicefirmware and download the latest firmware (version 4).
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12
b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 351.126 (2006-07-29 05:54:02)
b43-phy0 warning: You are using an old firmware image. Support for old firmware will be removed in July 2008.
b43-phy0 warning: You must go to http://linuxwireless.org/en/users/Drivers/b43#devicefirmware and download the latest firmware (version 4).
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12
b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 351.126 (2006-07-29 05:54:02)
b43-phy0 warning: You are using an old firmware image. Support for old firmware will be removed in July 2008.
b43-phy0 warning: You must go to http://linuxwireless.org/en/users/Drivers/b43#devicefirmware and download the latest firmware (version 4).
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12
warning: `dnsmasq' uses 32-bit capabilities (legacy support in use)
Marking TSC unstable due to cpufreq changes
Time: acpi_pm clocksource has been installed.
Clocksource tsc unstable (delta = -216948757 ns)

And this when ifconfig eth1 up:
b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 351.126 (2006-07-29 05:54:02)
b43-phy0 warning: You are using an old firmware image. Support for old firmware will be removed in July 2008.
b43-phy0 warning: You must go to http://linuxwireless.org/en/users/Drivers/b43#devicefirmware and download the latest firmware (version 4).
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12

The firmware I use is the one for 2.6.24, and it says it will be removed in July so it should still work for now, but I'm installing bleeding edge firmware to check if it works with it.
Comment 6 Michael Buesch 2008-03-09 07:23:05 UTC
so, well. It fails in dma_init because it's out of memory. That's not a bug, right?
Comment 7 Christian Casteyde 2008-03-09 08:04:27 UTC
Is it a joke? Of course, the computer memory shrinks a lot each time I boot 2.6.25-rc4, and is restored when I boot 2.6.24.3 :-) (unless b43 driver requiers 64BG of DMA memory!!)

I've investigated a little more:
1. It also happens with latest firmware:
root@athor:/home/christian# ifconfig eth1 up
b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 410.2160 (2007-05-26 15:32:10)
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12
SIOCSIFFLAGS: Cannot allocate memory

2. I added printks recursively and got:
b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 351.126 (2006-07-29 05:54:02)
b43-phy0 warning: You are using an old firmware image. Support for old firmware will be removed in July 2008.
b43-phy0 warning: You must go to http://linuxwireless.org/en/users/Drivers/b43#devicefirmware and download the latest firmware (version 4).
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_dma_init: before ssb_dma_set_mask
b43_setup_dmaring: after kzalloc1: ffff81004d084108
b43_setup_dmaring: after kzalloc2: ffff81004d84c000
b43_setup_dmaring: after kcalloc: ffff81004d8a0000
b43_setup_dmaring: after dma_map_single: 00000000de3a0000
b43_setup_dmaring: after ugh realloc kcalloc: ffff810000e0c000
b43_setup_dmaring: after dma_map_single 2: 00000000de3a1000
b43_setup_dmaring: error mapping dma 2
b43_dma_init: after b43_setup_dmaring 0 1: 0000000000000000
b43_dma_init: out: -12
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12
b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 351.126 (2006-07-29 05:54:02)
b43-phy0 warning: You are using an old firmware image. Support for old firmware will be removed in July 2008.
b43-phy0 warning: You must go to http://linuxwireless.org/en/users/Drivers/b43#devicefirmware and download the latest firmware (version 4).
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_dma_init: before ssb_dma_set_mask
b43_setup_dmaring: after kzalloc1: ffff81004d084108
b43_setup_dmaring: after kzalloc2: ffff81004d8e4000
b43_setup_dmaring: after kcalloc: ffff81004d8a0000
b43_setup_dmaring: after dma_map_single: 00000000dd224000
b43_setup_dmaring: after ugh realloc kcalloc: ffff810000e0c000
b43_setup_dmaring: after dma_map_single 2: 00000000dd225000
b43_setup_dmaring: error mapping dma 2
b43_dma_init: after b43_setup_dmaring 0 1: 0000000000000000
b43_dma_init: out: -12
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12
b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 351.126 (2006-07-29 05:54:02)
b43-phy0 warning: You are using an old firmware image. Support for old firmware will be removed in July 2008.
b43-phy0 warning: You must go to http://linuxwireless.org/en/users/Drivers/b43#devicefirmware and download the latest firmware (version 4).
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_dma_init: before ssb_dma_set_mask
b43_setup_dmaring: after kzalloc1: ffff81004d084108
b43_setup_dmaring: after kzalloc2: ffff81004d8ec000
b43_setup_dmaring: after kcalloc: ffff81004daa4000
b43_setup_dmaring: after dma_map_single: 00000000ddd42000
b43_setup_dmaring: after ugh realloc kcalloc: ffff810000e0c000
b43_setup_dmaring: after dma_map_single 2: 00000000ddd43000
b43_setup_dmaring: error mapping dma 2
b43_dma_init: after b43_setup_dmaring 0 1: 0000000000000000
b43_dma_init: out: -12
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12
b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 351.126 (2006-07-29 05:54:02)
b43-phy0 warning: You are using an old firmware image. Support for old firmware will be removed in July 2008.
b43-phy0 warning: You must go to http://linuxwireless.org/en/users/Drivers/b43#devicefirmware and download the latest firmware (version 4).
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_dma_init: before ssb_dma_set_mask
b43_setup_dmaring: after kzalloc1: ffff81004d084108
b43_setup_dmaring: after kzalloc2: ffff81004d8ec000
b43_setup_dmaring: after kcalloc: ffff81004daa4000
b43_setup_dmaring: after dma_map_single: 00000000ddd9e000
b43_setup_dmaring: after ugh realloc kcalloc: ffff810000e0c000
b43_setup_dmaring: after dma_map_single 2: 00000000ddd9f000
b43_setup_dmaring: error mapping dma 2
b43_dma_init: after b43_setup_dmaring 0 1: 0000000000000000
b43_dma_init: out: -12
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12

So it happens when 
                      if (b43_dma_mapping_error(ring, dma_test,
                                                  b43_txhdr_size(dev)))
is called in b43_setup_dmaring.

I haven't investigated why this function fails, I'm already quite deep, but its code seems to imply an address space layout / 64 bits problem. So maybe not a b43 problem finally...
Should I trace it (it's the last one obviously...)?
Comment 8 Michael Buesch 2008-03-09 08:11:10 UTC
You're on your own with this.
The function was introduced to fix another device.
As I don't have either of these devices that break here, I can't help you.

I suggest architectures finnaly fix their DMA code to correctly honor the dma mask, so we don't have to do this magic in the driver.
But that's an issue since years and nobody cares.
Comment 9 Christian Casteyde 2008-03-09 08:33:33 UTC
Glorps.

Anyway, I traced that function:

b43_op_start: enter
b43_op_start: before b43_wireless_core_init
b43_wireless_core_init: before ssb_bus_powerup
b43_wireless_core_init: before kzalloc: 1208
b43_wireless_core_init: before b43_phy_init_tssi2dbm_table
b43_wireless_core_init: after b43_phy_init_tssi2dbm_table: 0
b43_wireless_core_init: before b43_chip_init
b43-phy0: Loading firmware version 351.126 (2006-07-29 05:54:02)
b43-phy0 warning: You are using an old firmware image. Support for old firmware will be removed in July 2008.
b43-phy0 warning: You must go to http://linuxwireless.org/en/users/Drivers/b43#devicefirmware and download the latest firmware (version 4).
b43_wireless_core_init: after b43_chip_init: 0
b43_wireless_core_init: before b43_dma_init
b43_dma_init: before ssb_dma_set_mask
b43_setup_dmaring: after kzalloc1: ffff81004dfa3210
b43_setup_dmaring: after kzalloc2: ffff81004da94000
b43_setup_dmaring: after kcalloc: ffff81004d8ac000
b43_setup_dmaring: after dma_map_single: 00000000de092000
b43_dma_mapping_error: enter (ring=ffff81004dfa3210, type=30, addr=00000000de092000, buffersize=106)
b43_dma_mapping_error: after dma_mapping_error
b43_dma_mapping_error: in  B43_DMA_30BIT
b43_setup_dmaring: after ugh realloc kcalloc: ffff810000e0c000
b43_setup_dmaring: after dma_map_single 2: 00000000de093000
b43_dma_mapping_error: enter (ring=ffff81004dfa3210, type=30, addr=00000000de093000, buffersize=106)
b43_dma_mapping_error: after dma_mapping_error
b43_dma_mapping_error: in  B43_DMA_30BIT
b43_setup_dmaring: error mapping dma 2
b43_dma_init: after b43_setup_dmaring 0 1: 0000000000000000
b43_dma_init: out: -12
b43_wireless_core_init: after b43_dma_init: -12
b43_op_start: out: -12

So for the record, it fails with B43_DMA_30BIT, addr=00000000de093000, buffersize=106.

I cannot do anymore either (not a kernel/hardware developper). If I make this function return 0, I would be able to use my wireless network, or should I keep with 2.6.24 with this hardware?
Is it an architecture problem or a device specific problem? Device problems should be taken into account in drivers (it would be sad to maintain two drivers), shouldn't they?
Btw, I can help you test anything on this hardware if you need so.
Comment 10 Rafael J. Wysocki 2008-03-09 14:41:30 UTC
(In reply to comment #8)
> You're on your own with this.
> The function was introduced to fix another device.
> As I don't have either of these devices that break here, I can't help you.

Well, as far as I can say, the generally accepted policy is that we don't introduce changes fixing one system at the expense of breaking another one.  If the system used to work and the change breaks it, the change should be reverted.
Comment 11 Rafael J. Wysocki 2008-03-09 15:20:13 UTC
This entry is being used for tracking a regression from 2.6.24.  Please don't
close it until the problem is fixed in the mainline.

Handled-By : Michael Buesch <mb@bu3sch.de>
Comment 12 Michael Buesch 2008-03-09 16:51:23 UTC
I'm not sure which of your weird definitions apply here, but
we have one regression, that we fixed. Unfortunately this introduced another bug.
We are _NOT_ going to fix that by reverting the fix.

Unfortunately most architecture maintainers don't care to fix their broken DMA
implementations to honor the dma_mask. So I'm not sure what we should do here
in the driver. We are screwed. We've always been. I don't really see this as
a regression. It maybe worked by luck in the past. But this was _always_ broken.
Comment 13 Christian Casteyde 2008-03-14 11:57:55 UTC
Well, I'm not sure I understand everything there.
Basically, you say the driver should never have worked and there was another device that used to work with that driver since it was included in the kernel (assuming it was 2.6.23), then 2.6.24 broke it, then 2.6.25 fix it and broke the device I use? Which device is the most used?
Second, I looked at the dma.c diff, it seems there used to be a flag to disable DME within b43. This flag was removed, and if I understand well, it could have help broken driver to work. So ok, if the driver should have always failed, then I consider there is a regression by removing this config option, right?
Finally, I used to have a working driver (bcm43xx), if b43 (or whatever DMA code) cannot replace it, it seems to me important not to remove the old bcm43xx driver!!
It would be sad to go back to ndiswrapper now... From the user point of view, closing this bug is certainly not correct. For me if arch dma cannot be fixed, it becomes the driver problem, and if the driver cannot be fixed, then the previous driver should not be removed (then if it cannot be kept, the whole kernel 2.6.25+ branch will be simply unusable).
Please note I can understand the previous comment if the explication is simply "I won't/do not have the time to/do not want to fix it", but the bug shouldn't be closed in any case, the problem is still there and it is indeed **used to work** for my hardware.
Anyway, do you have any advice for me to keep using recent kernels and get wireless connectivity? I asked if I make the b43_dma_mapping_error returns 0, will it work (I could maintain a patch for myself)?
Comment 14 Michael Buesch 2008-03-14 12:18:04 UTC
(In reply to comment #13)
> I asked if I make the b43_dma_mapping_error returns 0,
> will it work (I could maintain a patch for myself)?

My magic crystalball is unfortunately broken, so I can not tell, if that would fix it for you.

Note that whining about whatever used to work and whatever has no effect, as I don't have these devices. So I can not fix it. So I am _forced_ to ignore this issue.
If you want it to be fixed, please tell me how to.
Comment 15 Christian Casteyde 2008-03-14 15:25:44 UTC
OK, I'll try. But I still think the bug is not completly analyzed.

That is, maybe the mask is not honored anymore for x86-64 and it still was in 2.6.24. That would therefore be a x86-64 regression in DMA mask or maybe in memory layout (and if true, that would be an important regression, because other drivers may also fails or fallback to a performance PIO regression).

I'll try to printk DMA addresses returned in 2.6.24, just to see if the results are acceptable DMA zone with this kernel.
Comment 16 Christian Casteyde 2008-03-15 00:36:18 UTC
Well, here are my findings.

First, effectively, the DMA addresses returned by dma_map_single is not the same at all with 2.6.25. 2.6.24 gives:
First DMA single: 0000000000000000
Second DMA single: 0000000000800000
First DMA single: 0000000000000000
Second DMA single: 0000000000a04000
First DMA single: 0000000000000000
Second DMA single: 000000000086c000
First DMA single: 0000000000000000
Second DMA single: 0000000000874000
First DMA single: 0000000000000000
Second DMA single: 0000000000a08000
First DMA single: 0000000000000000
Second DMA single: 0000000000a0c000
Which are below 30bits.

With 2.6.25, I get for instance:
First dma single: 00000000ddfb3000
Second dma single: 00000000ddfb4000
Which cause the driver to fail.

Then I read the DMA-API file in the documentation, and found:
"Note also that the above constraints on physical contiguity and
dma_mask may not apply if the platform has an IOMMU (a device which
supplies a physical to virtual mapping between the I/O memory bus and
the device).  However, to be portable, device driver writers may *not*
assume that such an IOMMU exists."

Therefore, I tried with 2.6.25, but with the boot option iommu=off.
I got:
First dma single: 0000000000000000
Second dma single: 0000000000e0c000
First dma single: 0000000000000000
Second dma single: 0000000000e28000
First dma single: 0000000000000000
Second dma single: 0000000000e74000
First dma single: 0000000000000000
Second dma single: 0000000000e20000
...

and the driver worked.
So, clearly, the IOMMU has been activated in 2.6.25, or its behaviour was changed, which broke this driver. This was the kind of thing I was expecting.

But, if I read well the documentation: "may not apply if the platform has an IOMMU " means for me that b43_dma_mapping_error should only use dma_mapping_error if the platform has an IOMMU that does DMA mask useless. As stated, this is still a driver error, because the driver **expects** DMA masks to be honored, and this is clearly not generally true.

The question therefore is: How can we check dma mappings in a moving environment:
- platform without IOMMU are supposedly honoring DMA masks ;
- platform with IOMMU **may** (but apparently not all) honor it.
This is not sufficient from DMA dev to say that, they also must export some information about IOMMU implementation: in particular if masks should be taken into account or not...

I'm not a kernel dev, and do not know how IOMMU works, but it may be wise to ask some arch people how all this should be done.

From what I saw, I guess that returning directly dma_mapping_error(addr) in b43_dma_mapping_error should work with IOMMU (otherwise, DMA code is also not IOMMU compliant). I'm going to test this ASAPP.

I suppose the driver is not compliant with DMA API, but this bug may really be affected to IOMMU/DMA also, since this seems to be an integration problem (we don't know how to check DMA mask). I'm therefore reopening this bug and reaffecting it to memory, because the situation is not clear enough for me to just say "there is no problem": there is a big problem on x86_64 with IOMMU + all drivers that check DMA masks (that is, drivers that are not plain stupid but also not wise enough).

I insist: it may not be a b43 problem, so closing it from the point of vue of b43 dev is not the right action. This is not whining, this is pure logic for me.
Comment 17 Christian Casteyde 2008-03-15 00:57:08 UTC
I've tested returning directly dma_mapping_error(addr) in b43_dma_mapping_error. It does not work (quite logical, since MMU addresses are not accepted by the device I guess). So, b43 driver cannot be fixed as is.

I restate the bug therefore:
2.6.25 has either activated IOMMU or changed its behaviour on at least x86_64, in such a way DMA masks are not taken into account, and this breaks drivers such as b43. Therefore either DMA is not usable anymore, either IOMMU has to be deactivated.
Comment 18 Christian Casteyde 2008-03-15 01:00:03 UTC
Well, I changed the bug title.
Please note also this may not be a driver/iommu regression (only a Kconfig one), if 2.6.24 Kconfig defaults to no IOMMU and 2.6.25 activates it (I didn't found the option to deactivates IOMMU except in boot options).
Comment 19 Christian Casteyde 2008-03-15 05:07:27 UTC
This is not a regression. IOMMU was forced by a debug option in kernel hacking.
Sorry for the inconvenience.
Comment 20 John W. Linville 2008-03-15 07:28:10 UTC
Hold on...I suspect this relates to the "better DMA allocator" topic written about here:

   http://lwn.net/Articles/272693/

It seems to me that if you need to use "iommu=off" now when you didn't need to before then there _is_ a regression -- just not necessarily one with b43.  AFAIK, turning-on IOMMU should not break a driver.

I'll have to refresh myself on the details of the DMA allocation in b43 before I can comment further.  Perhaps Michael and/or Andi (whom I couldn't add to the CC list as "ak@suse.de" seems to be unknown) can shed some light on the subject in the meantime.
Comment 21 Christian Casteyde 2008-03-15 08:53:40 UTC
No, it is not a regression from 2.6.24.

I only need to use "iommu=off" if I turn on IOMMU debugging in kernel hacking, in which I regularly activates some options to check -rc. This option was not activated in my 2.6.24 configuration, and indeed it also fails with 2.6.24 if I check it.

However, certainly IOMMU should not break drivers, sure, so this bug can be reopened. But this code is also marked as unsafe for production code, and IOMMU is not activated without this option on my hardware, so I thought it was an incompatible configuration. But actually I don't know.
Comment 22 Michael Buesch 2008-03-15 09:32:46 UTC
(In reply to comment #16)
> But, if I read well the documentation: "may not apply if the platform has an
> IOMMU " means for me that b43_dma_mapping_error should only use
> dma_mapping_error if the platform has an IOMMU that does DMA mask useless. As
> stated, this is still a driver error, because the driver **expects** DMA
> masks
> to be honored, and this is clearly not generally true.

Not at all. This is a _hardware_ _limitation_.
We can not write an address bigger than 30bit to the hardware, as the hardware register is only 30bit wide. Fullstop. That's the story.
We can not fix that in the driver. Please file a bugreport to your architecture.

> The question therefore is: How can we check dma mappings in a moving
> environment:

To say it again: The check is _not_ broken. The architecture, which ignores the DMA mask is broken. I do not care, if there's an IOMMU or whatever. I simply can not write the address that the kernel passed to me to the hardware.

> From what I saw, I guess that returning directly dma_mapping_error(addr) in
> b43_dma_mapping_error should work with IOMMU (otherwise, DMA code is also not
> IOMMU compliant). I'm going to test this ASAPP.

Won't do.
Comment 23 Andi Kleen 2008-03-15 13:13:47 UTC
> We can not fix that in the driver. Please file a bugreport to your
> architecture.

It is possible to work around such hw limits by bouncing through GFP_DMA (see the b44 driver which suffers from similar crippled hardware). The code is ugly of course, but ugly hardware requires ugly code. My DMA mask allocator patchset
which is currently pending review on linux-kernel will also offer a slightly 
better way to handle this in the future.

Since very devices suffer from this the pci dma IOMMU code is not the right
place to fix it, although once the mask allocator is in swiotlb on 64bit 
will be able to handle it. There is no swiotlb on 32bit so far though.

So for the forseeable future you'll need manual bouncing code.
Comment 24 Michael Buesch 2008-03-15 13:29:48 UTC
(In reply to comment #23)
> > We can not fix that in the driver. Please file a bugreport to your
> > architecture.
> 
> It is possible to work around such hw limits by bouncing through GFP_DMA

The code does this already, but it doesn't work on this architecture.

> Since very devices suffer from this the pci dma IOMMU code is not the right
> place to fix it, although once the mask allocator is in swiotlb on 64bit 
> will be able to handle it. There is no swiotlb on 32bit so far though.
> 
> So for the forseeable future you'll need manual bouncing code.

So why don't we remove that dma_mask stuff from struct device then? It does not work at all. It never worked. It's useless.

What's so hard about honoring that mask, _especially_ on an IOMMU?
Comment 25 Andi Kleen 2008-03-15 14:07:56 UTC
> The code does this already, but it doesn't work on this architecture.

Then the code is not correctly implemented. The b44 implementation
definitely works on x86-64 too. I haven't looked at it, but I assume
you rely on lowmem <= 30bit. That would already fail on a i386 kernel
compiled with a non standard page offset split. The right way is 
to bounce through GFP_DMA

> (flame snipped) 

Mask works just not how you think it does.

Real IOMMUs can honour all masks, but the AGP GART IOMMU and the swiotlb
used on x86-64 are not real iommus. They are rather "memory hole" implementations which only work for 4GB masks.
Comment 26 Michael Buesch 2008-03-15 14:15:46 UTC
(In reply to comment #25)
> > The code does this already, but it doesn't work on this architecture.
> 
> Then the code is not correctly implemented. The b44 implementation
> definitely works on x86-64 too. I haven't looked at it, but I assume
> you rely on lowmem <= 30bit. That would already fail on a i386 kernel
> compiled with a non standard page offset split. The right way is 
> to bounce through GFP_DMA

We bounce through GFP_DMA if we detect that the dma address we got from the kernel does not fit into the register (or descriptor). What can we do wrong here? I'm pretty sure b44 has the very same problem (as the DMA engine is exactly the same silicon). Did you try it with iommu enabled? I don't have a machine to test.

> > (flame snipped) 
> 
> Mask works just not how you think it does.
> 
> Real IOMMUs can honour all masks, but the AGP GART IOMMU and the swiotlb
> used on x86-64 are not real iommus. They are rather "memory hole"
> implementations which only work for 4GB masks.

Yeah, as I said. It's useless and doesn't do what I tell it to do. I request a 30bit mask, yet what I get can be a 32bit address.
Comment 27 Christian Casteyde 2008-03-16 00:57:02 UTC
I've looked to b44 code and there is slight differences with b43:
- b44 code uses dma_alloc_coherent in one place, without GFP_DMA, which seems to be a problem for two reasons:
   * first, one of its buffer seems to be a DMA buffer, and x86_64 IOMMU may not add automatically GFP_DMA to honour masks if it cannot bind this into a memory hole (to be checked);
   * second, the DMA_API explicitly states the reverse: GFP_DMA would be ignored if present on some architectures (but maybe the x86_64 does not, to be checked also);
- b44 code use __netdev_alloc_skb instead of kcalloc in b43;
- in case of failure after the first dma_map_single, b44 frees the DMA of the buffer before trying to bind the GFP_DMA alloced a second time, whereas b43 does not.

First point seems to be a b44 bug (as far as I can read and understand it).
Second point is probably irrelevant.
Third point may be the problem with b43: if the x86_64 IOMMU tries to give contiguous DMA areas for the same device (and I think this is indeed really a good idea, or even better the only wise thing to do), then the binding is ignored and the something very near is returned. I observed that:
First dma single: 00000000ddfb3000
Second dma single: 00000000ddfb4000
It returns the next page...

So maybe trying to free the previous dma mapping (this should be done anyway I think) before retrying would fix the problem.

For example:
                /* test for ability to dma to txhdr_cache */
                dma_test = dma_map_single(dev->dev->dev,
                                          ring->txhdr_cache,
                                          b43_txhdr_size(dev),
                                          DMA_TO_DEVICE);

                if (b43_dma_mapping_error(ring, dma_test, b43_txhdr_size(dev))) {
                        /* hugh: free DMA and realloc */
                        if (!dma_mapping_error(dma_test))
                            dma_unmap_single(dev->dev->dev,
                                          ring->txhdr_cache,
                                          b43_txhdr_size(dev),
                                          DMA_TO_DEVICE);
                        kfree(ring->txhdr_cache);
                        ring->txhdr_cache = kcalloc(nr_slots,
                                                    b43_txhdr_size(dev),
                                                    GFP_KERNEL | GFP_DMA);
                        if (!ring->txhdr_cache)
                                goto err_kfree_meta;

                        dma_test = dma_map_single(dev->dev->dev,
                                                  ring->txhdr_cache,
                                                  b43_txhdr_size(dev),
                                                  DMA_TO_DEVICE);

I could check on this function (other places should have to be fixed of course).
Comment 28 Christian Casteyde 2008-03-16 01:01:53 UTC
Also I don't know why dma_map_single would ignore the fact the second buffer to bind is not in the same place (trying to group DMA is one thing, but in this case this is an error even if the previous mapping is not freed).
I didn't checked that the second alloc is allocated before 30bits anyway, however if the GFP_DMA is ignored, that would be really critical...
Comment 29 Andi Kleen 2008-03-16 01:11:56 UTC
>   * first, one of its buffer seems to be a DMA buffer, and x86_64 IOMMU may
>not add automatically GFP_DMA to honour masks if it cannot bind this into a
>memory hole (to be checked);

x86-64 dma alloc coherent does add GFP_DMA as needed, it just cannot
be done for map_single

>   * second, the DMA_API explicitly states the reverse: GFP_DMA would be
>ignored if present on some architectures (but maybe the x86_64 does not, to be
>checked also);

For masks < 32bit GFP_DMA shouldn't be ignored on x86-64. However with
the mask allocator there are slightly better ways to do this than
using GFP_DMA, but that one is not merged yet.

> - b44 code use __netdev_alloc_skb instead of kcalloc in b43;

As long as GFP_DMA is passed that should be ok.

> - in case of failure after the first dma_map_single, b44 frees the DMA of the
> buffer before trying to bind the GFP_DMA alloced a second time, whereas b43
> does not.

Freeing is better, but unless it leaks it should be ok too to temporarily
hold more.
Comment 30 Christian Casteyde 2008-03-16 01:50:39 UTC
It leaks. The first dma reservation is lost when the second one is tried.
Michael: should I issue another bug on b43 for the leak (since this bug is not a b43 bug anymore)?

Anyway, I've just done the test, and it seems that GFP_DMA is taken into account by kcalloc, but dma_map_single does not do its job, as Michael previously said.

I'm running this code (with IOMMU activated, and with the leak fixed + 2 printks):

                ring->txhdr_cache = kcalloc(nr_slots,
                                            b43_txhdr_size(dev),
                                            GFP_KERNEL);
                if (!ring->txhdr_cache)
                        goto err_kfree_meta;

                /* test for ability to dma to txhdr_cache */
                dma_test = dma_map_single(dev->dev->dev,
                                          ring->txhdr_cache,
                                          b43_txhdr_size(dev),
                                          DMA_TO_DEVICE);
                printk(KERN_ERR "First alloc: %p - %p\n", ring->txhdr_cache, dma_test);

                if (b43_dma_mapping_error(ring, dma_test, b43_txhdr_size(dev))) {
                        /* ugh realloc */
                        if (!dma_mapping_error(dma_test))
                dma_unmap_single(dev->dev->dev,
                                 dma_test, b43_txhdr_size(dev),
                                 DMA_TO_DEVICE);
                        kfree(ring->txhdr_cache);
                        ring->txhdr_cache = kcalloc(nr_slots,
                                                    b43_txhdr_size(dev),
                                                    GFP_KERNEL | GFP_DMA);
                        if (!ring->txhdr_cache)
                                goto err_kfree_meta;

                       dma_test = dma_map_single(dev->dev->dev,
                                                  ring->txhdr_cache,
                                                  b43_txhdr_size(dev),
                                                  DMA_TO_DEVICE);
                printk(KERN_ERR "Second alloc: %p - %p\n", ring->txhdr_cache, dma_test);

                        if (b43_dma_mapping_error(ring, dma_test,
                                                  b43_txhdr_size(dev)))
                                goto err_kfree_txhdr_cache;
                }

                dma_unmap_single(dev->dev->dev,
                                 dma_test, b43_txhdr_size(dev),
                                 DMA_TO_DEVICE);

It prints:
First alloc: ffff81004e52c000 - 00000000dd1f1000
Second alloc: ffff810000808000 - 00000000dd1f2000

So the GFP_DMA flag moves the buffer to low memory(if I remove the high order bits, it's 0x808000), but the result of dma_map_single is still in high memory addresses. It should return the physical pages, which are effectively below 30 bits.
Comment 31 Michael Buesch 2008-03-16 03:06:08 UTC
(In reply to comment #30)
> It leaks. The first dma reservation is lost when the second one is tried.
> Michael: should I issue another bug on b43 for the leak (since this bug is
> not
> a b43 bug anymore)?

If you want it fixed, please send fixes.
Filing random bugreports will gain nothing, as I can not test this.
Comment 32 Christian Casteyde 2008-03-16 06:41:57 UTC
OK, I won't fill a bug. I thought it could help to track the problem.

The fix is already available, it's in comment #30. Simply remove printks and you have it.

Maybe there are other areas were dma_unmap_single should be called before doing a second try. That's quite easy to find, look for b43_dma_mapping_error. But I won't do that job, since I'm not the maintainer of this driver, and willing to stay a regular user with testing skills.
Comment 33 Christian Casteyde 2008-03-16 06:46:49 UTC
Btw, there is another way to leak, event in the snippet of comment #30: if the second b43_dma_mapping_error also fails due to adress check, the dma_mapping is done in the DMA_API, and yet not freed either.
Comment 34 Rafael J. Wysocki 2008-03-16 11:04:16 UTC
(In reply to comment #32)
> OK, I won't fill a bug. I thought it could help to track the problem.
> 
> The fix is already available, it's in comment #30. Simply remove printks and
> you have it.

Can you please prepare a clean patch, with a changelog and signed off by you, and post it to the LKML with CCs to Michael and John Linville?
Comment 35 Christian Casteyde 2008-03-16 16:43:48 UTC
I'll try to fix the leak then the next week-end (eh, maybe not, it's easter...). But I don't know what "signing off" means. I'll also certainly mail privatly, since I don't want to register to LKML for just one patch.
Comment 36 John W. Linville 2008-03-17 13:56:18 UTC
Christian, please look here:

   http://linux.yyz.us/patch-format.html

Also is is not necessary to "register to LKML".  Just email the patch to linux-wireless@vger.kernel.org and CC Michael and me.  We are much more accustomed to reviewing patches on the list anyway.  Thanks!
Comment 37 Alan 2009-03-23 10:11:42 UTC
Closing old stale bug

Note You need to log in before you can comment on or make changes to this bug.