Bug 211117 - dma_mmap_coherent() - garbage sound output with mem_encrypt on Ryzen platform
Summary: dma_mmap_coherent() - garbage sound output with mem_encrypt on Ryzen platform
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Platform_x86 (show other bugs)
Hardware: x86-64 Linux
: P1 high
Assignee: drivers_platform_x86@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-10 16:28 UTC by Triton Circonflexe
Modified: 2021-01-20 12:47 UTC (History)
3 users (show)

See Also:
Kernel Version: 5.10.6
Subsystem:
Regression: No
Bisected commit-id:


Attachments
.asoundrc configuration file (474 bytes, text/plain)
2021-01-10 16:28 UTC, Triton Circonflexe
Details
Output of alsa-info.sh (52.86 KB, text/plain)
2021-01-17 14:09 UTC, Triton Circonflexe
Details

Description Triton Circonflexe 2021-01-10 16:28:30 UTC
Created attachment 294595 [details]
.asoundrc configuration file

When mem_encrypt is activated on a Ryzen 3600, the virtuoso driver produces only white noise with standard configuration on a Xonar D2X sound card.

Kernel version:
- First observed with 5.4.60 with Gentoo patchset (gentoo-sources 5.4.60)
- Last tests executed with Vanilla 5.10.6 (vanilla-sources 5.10.6)

$ uname -a
Linux kobayashi 5.10.6-Kobayashi #1 SMP Sun Jan 10 11:27:58 CET 2021 x86_64 AMD Ryzen 5 3600 6-Core Processor AuthenticAMD GNU/Linux

ALSA plugins version: 1.2.2

Hardware:
- Asus Xonar D2X: virtuoso driver, where the problem happens
- Integrated Reatek ALC1200A (Asus TUF Gaming B550M Plus): hda-intel driver, additional tests for reference

Using default configuration from Gentoo distribution (which looks pretty standard), I get only white noise, strictly equivalent to the execution of this command:
aplay -t raw -c 1 -r 48000 -f S16_LE /dev/random

Not being able to understand the complete standard configuration, I tried to create one from the ground up without success regarding my final goal (something just working for all applications, including when several are running concurrently) but I was able to create a minimal configuration which, AFAIK is not impacted by the system configuration.

- pcm_slave.<device>_slave targets the hardware ("hw:D2X,0" or "hw:Generic,0")
- pcm.copied_<device> copies the output to pcm_slave.<device>_slave and a file.
- pcm.routed_<device> routes the FL and FR input channels to the FL and FR channels of pcm.copied_<device> (useless but my initial intent was to swap some of the channels in 5.1 and 7.1 setups)

The output to file was intended to show the random output but this is a part that works as expected (no difference between tests).

Executed tests (CONFIG_CMDLINE="root=PARTUUID=54cffe4d-bed1-9a49-adc9-d7aef941e1b7 radeon.modeset=1"):
$ speaker-test -t sine -f 375 -c 2 -Drouted_xonar # White noise
$ speaker-test -t sine -f 375 -c 2 -Dcopied_xonar # 375 Hz sine
$ speaker-test -t sine -f 375 -c 2 -Drouted_realtek # 375 Hz sine
$ speaker-test -t sine -f 375 -c 2 -Dcopied_realtek # 375 Hz sine

First observation: it seems specific to the virtuoso driver since the embedded Realtek does not show the problem.

Then, I rebooted with the additional flag mem_encrypt=off and all 4 tests were successful.

Note: once again, it was luck, the mem_encrypt=off test was intended for my virtualbox issues:
- https://bugs.archlinux.org/task/68541
- https://www.virtualbox.org/ticket/20021
- https://www.virtualbox.org/ticket/20084

Do not hesitate to tell if anything else is required (complete .config, for instance) or if you think I should still open the ticket on alsa-project side.

And if you need me to test, I will gladly do it (although I cannot guaranty it will be done on the same day, I will follow the subject closely).
Comment 1 Takashi Iwai 2021-01-12 17:44:49 UTC
Do you have the issue if you play without alsa-lib plugins, i.e. with aplay -Dhw:0 or such?  And is this bug specific to the oxygen driver, or it happens on other devices like USB-audio?  I'm wondering which part is problematic with SEV.
Comment 2 Triton Circonflexe 2021-01-12 18:08:31 UTC
Direct play to "hw:D2X,0" works (the copy was just to try to capture the result but the result is the same without it).
-> Meaning it's probably a plugin issue.


And as explained, the exact same configuration on an integrated Realtek ALC 1200A (hda-intel driver) works perfectly fine.
-> Meaning it's probably a driver issue.

So honestly, I'm quite at loss here. :-/
Comment 3 Takashi Iwai 2021-01-13 09:00:31 UTC
Hm, that's quite puzzling.

Does the behavior change if you call dma_set_mask_and_coherent() with DMA_BIT_MASK(32) explicitly in the oxygen driver?
Comment 4 Triton Circonflexe 2021-01-13 13:04:39 UTC
I'm willing to test but I have no idea how/where to do that (developer but not a kernel driver dev), could you explain?
Comment 5 Takashi Iwai 2021-01-13 14:35:05 UTC
It's something like:
--- a/sound/pci/oxygen/oxygen_lib.c
+++ b/sound/pci/oxygen/oxygen_lib.c
@@ -650,6 +650,8 @@ int oxygen_pci_probe(struct pci_dev *pci, int index, char *id,
 	pci_set_master(pci);
 	card->private_free = oxygen_card_free;
 
+	dma_set_mask_and_coherent(&pci->dev, DMA_BIT_MASK(32));
+
 	configure_pcie_bridge(pci);
 	oxygen_init(chip);
 	chip->model.init(chip);


I don't expect much by this change, so just to be sure...

Other than that, check the contents of /proc/asound/card*/pcm0p/sub0/hw_params while playing back different cases (with working device, with non-working device, with plugins, etc).  Especially check if it's running in the mmap access mode or not.  e.g. you can run aplay with -M option to enforce the mmap mode for -Dhw:*, too.
Comment 6 Triton Circonflexe 2021-01-16 14:27:14 UTC
Sorry for the delay, here are the test results:

With or without the patch, always the same result (what was working is still working and vice-versa).

Now, regarding the content of /proc/asound/card*/pcm0p/sub0/hw_params

Note: card0 = D2X/virtuoso and card1 = Generic/hda-intel.

The content of the settings for any given test is entirely identical with or without the mem_encrypt=off boot option.

The content of /proc/asound/card0/pcm0p/sub0/hw_params for a test with hw:0,0 is very similar to the content of /proc/asound/card1/pcm0p/sub0/hw_params for the same test with hw:1,0:
- period_size and buffer_size vary but buffer_size/period_size = 4.0 in all cases.
- everything else is identical, beginning with access.

Probably no surprise here, but tested anyway.

Regarding the tests with speaker-test:
- routed_* shows MMAP_INTERLEAVED
- copied_* shows RW_INTERLEAVED (as well as hw:*,0)

Surely enough, the tests with aplay are consistent with the request:
- aplay -D"hw:$card,0" shows RW_INTERLEAVED and works
- aplay -M -D"hw:$card,0" shows MMAP_INTERLEAVED and outputs white noise on D2X

Additional tests with the default system configuration alsa shows MMAP_INTERLEAVED which is once again consistent.
Comment 7 Takashi Iwai 2021-01-17 08:10:04 UTC
Thanks.  It's clearer now that it's the mmap that causes the problem.

Could you give alsa-info.sh output for further verification?  Run the script with --no-upload option and attach the output to Bugzilla.

And, now let's try other way round; modify HD-audio driver and see whether the issue happens with the mmap mode.  The patch below enforces the 32bit DMA mask and the same buffer allocation type without S/G for HD-audio:

--- a/sound/pci/hda/hda_controller.c
+++ b/sound/pci/hda/hda_controller.c
@@ -720,7 +720,7 @@ int snd_hda_attach_pcm_stream(struct hda_bus *_bus, struct hda_codec *codec,
        int pcm_dev = cpcm->device;
        unsigned int size;
        int s, err;
-       int type = SNDRV_DMA_TYPE_DEV_SG;
+       int type = SNDRV_DMA_TYPE_DEV;
 
        list_for_each_entry(apcm, &chip->pcm_list, list) {
                if (apcm->pcm->device == pcm_dev) {
@@ -759,8 +759,6 @@ int snd_hda_attach_pcm_stream(struct hda_bus *_bus, struct hda_codec *codec,
        size = CONFIG_SND_HDA_PREALLOC_SIZE * 1024;
        if (size > MAX_PREALLOC_SIZE)
                size = MAX_PREALLOC_SIZE;
-       if (chip->uc_buffer)
-               type = SNDRV_DMA_TYPE_DEV_UC_SG;
        snd_pcm_set_managed_buffer_all(pcm, type, chip->card->dev,
                                       size, MAX_PREALLOC_SIZE);
        return 0;
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
index 8d568277088a..bcf3b0cadf5c 100644
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -1967,6 +1967,7 @@ static int azx_first_init(struct azx *chip)
        /* allow 64bit DMA address if supported by H/W */
        if (!(gcap & AZX_GCAP_64OK))
                dma_bits = 32;
+       dma_bits = 32;
        if (!dma_set_mask(&pci->dev, DMA_BIT_MASK(dma_bits))) {
                dma_set_coherent_mask(&pci->dev, DMA_BIT_MASK(dma_bits));
        } else {
Comment 8 Triton Circonflexe 2021-01-17 14:09:06 UTC
Created attachment 294699 [details]
Output of alsa-info.sh

Good news... well, sorta: the sound is now equally horrible on hda-intel with mmap activated.

Note: I had a few error messages on alsa-info.sh execution:
cat: /proc/asound/modules: No such file or directory
cat: /sys/class/sound/hwC1D0/user_pin_configs: No such file or directory
cat: /sys/class/sound/hwC1D0/init_verbs: No such file or directory
cat: /sys/class/sound/hwC1D0/hints: No such file or directory
cat: /sys/class/sound/hwC2D0/user_pin_configs: No such file or directory
cat: /sys/class/sound/hwC2D0/init_verbs: No such file or directory
cat: /sys/class/sound/hwC2D0/hints: No such file or directory

One thing I forgot to mention is that the virtuoso driver is built in-kernel while the hda-intel is built as a module because at first, I didn't intend to build the hda-intel at all since I did not think I would use the embedded chip.
Comment 9 Takashi Iwai 2021-01-18 07:40:16 UTC
OK, then could you check which change causes the problem on snd-hda-intel: the change in hda_intel.c (setting dma_mask) or in hda-controller.c (setting SNDRV_DMA_TYPE_*)?
Comment 10 Triton Circonflexe 2021-01-18 12:59:11 UTC
I can.

- With only the hda_intel.c (dma_bits = 32) part of the patch, I hear music.

- With only the hda_controller.c (type = SNDRV_DMA_TYPE_DEV) part of the patch, I hear white noise.
Comment 11 Takashi Iwai 2021-01-18 13:05:04 UTC
OK, thanks.  Then, could you try the change below together with SNDRV_DMA_TYPE_DEV change?  Does it work now?

--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3713,7 +3713,7 @@ int snd_pcm_lib_default_mmap(struct snd_pcm_substream *substream,
 				area->vm_end - area->vm_start, area->vm_page_prot);
 	}
 #endif /* CONFIG_GENERIC_ALLOCATOR */
-	if (IS_ENABLED(CONFIG_HAS_DMA) && !substream->ops->page &&
+	if (0 && IS_ENABLED(CONFIG_HAS_DMA) && !substream->ops->page &&
 	    (substream->dma_buffer.dev.type == SNDRV_DMA_TYPE_DEV ||
 	     substream->dma_buffer.dev.type == SNDRV_DMA_TYPE_DEV_UC))
 		return dma_mmap_coherent(substream->dma_buffer.dev.dev,
Comment 12 Triton Circonflexe 2021-01-18 14:03:45 UTC
With hda_controller.c AND pcm_native.c patches, the Generic/hda-intel device is working as expected.

pcm_native.c being obviously a common file, I also tested the D2X/virtuoso device and it works too.
Comment 13 Takashi Iwai 2021-01-18 14:16:37 UTC
So, the test result implies that dma_mmap_coherent() is broken with the memory encryption while the mmap with the page fault still works.
Comment 14 Triton Circonflexe 2021-01-18 14:19:09 UTC
I'll trust you on that. :-)

In any case, I stay available for further tests if needed.
Comment 15 Jaroslav Kysela 2021-01-18 14:27:20 UTC
It looks like an issue which is not related directly to audio, right?

Reassigning to Platform_X86 .
Comment 16 Takashi Iwai 2021-01-18 14:36:09 UTC
Yes, this must be a generic issue.  I'll ask Christoph at first.
Comment 17 thomas.lendacky 2021-01-19 19:33:00 UTC
This sounds like a mismatch between the encryption bit in the kernel and the encryption bit in userspace. It looks like that should be taken care of by the dma_pgprot() call in dma_mmap_attrs() or in iommu_dma_mmap(). But maybe the force_dma_unencrypted() in arch/x86/mm/mem_encrypt.c needs to understand if the IOMMU is doing the mapping. Since, even if the device doesn't support 48-bit or higher DMA, it will still done encrypted because of the IOMMU. I don't see any dmesg output, is the IOMMU enabled? What happens if you do iommu=pt on the kernel command line?

Alternatively, if you want memory encryption for your bare-metal system, you can see if the BIOS supports TSME (Transparent SME). Then you can remove mem_encrypt=on or add mem_encrypt=off (if enabled by default), and still get memory encryption.
Comment 18 Triton Circonflexe 2021-01-20 12:47:01 UTC
Tested with iommu=pt (and no mem_encrypt option, so "on" by default): still white noise.

Regarding the encryption, I don't really have a _need_ for that. If I use it, it's in part because Virtualbox seems to require it in some use cases (not even entirely sure about that) and mostly because it's activated by default (and I like default settings to work correctly).
I checked the BIOS anyway but did not find any TSME or similar option (thanks for the information, though).

Right now, I'm working with mem_encrypt=off and I can do everything I need for now (no vbox for the moment).
I will continue to follow this bug because I can reproduce the problem easily and do tests if required.

Note You need to log in before you can comment on or make changes to this bug.