Bug 212077
Summary: | AMD GPU discrete card memory at highest frequency even while not in use | ||
---|---|---|---|
Product: | Drivers | Reporter: | Bat Malin (bat_malin) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | RESOLVED CODE_FIX | ||
Severity: | high | CC: | alexdeucher |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 5.11.3 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | Dmesg; Picture of memory status; Picture of memory status (new); Dmesg (new); possible fix |
Created attachment 295679 [details]
Picture of memory status
Should be fixed with this patch: https://patchwork.freedesktop.org/patch/422999/

Thank you Alex!

Issue not fixed in kernel 5.11.4

Issue still present in 5.11.5

    [    1.335057] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.

No change in the code of 5.12-rc2...

    for (i = 0; i < dep_mclk_table->count; i++) {
        for (j = 0; j < dep_sclk_table->count; j++) {
            valid_entry = false;
            for (k = 0; k < watermarks->num_wm_sets; k++) {
                if (dep_sclk_table->entries[i].clk / 10 >= watermarks->wm_clk_ranges[k].wm_min_eng_clk_in_khz &&
                    dep_sclk_table->entries[i].clk / 10 < watermarks->wm_clk_ranges[k].wm_max_eng_clk_in_khz &&
                    dep_mclk_table->entries[i].clk / 10 >= watermarks->wm_clk_ranges[k].wm_min_mem_clk_in_khz &&
                    dep_mclk_table->entries[i].clk / 10 < watermarks->wm_clk_ranges[k].wm_max_mem_clk_in_khz) {
                    valid_entry = true;
                    table->DisplayWatermark[i][j] = watermarks->wm_clk_ranges[k].wm_set_id;
                    break;

Code not fixed in 5.11.6

Code fixed in 5.11.7. Thank you!

Code fixed, but the GPU is still running at the highest possible clock.

Created attachment 295905 [details]
Picture of memory status (new)
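The loop quoted from the 5.12-rc2 source assigns a watermark set to each clock level by testing it against the ranges supplied by the display code, and the dmesg warning, judging by its text, is printed when no range matches. A minimal user-space sketch of that kind of lookup is shown below. It is not the driver code: `struct wm_range`, `pick_wm_set` and the `matched` flag are hypothetical names, only a single clock is checked per range, and all values are assumed to be in the same unit (kHz), which may not hold for the real driver tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one watermark clock range. */
struct wm_range {
    unsigned int set_id;   /* watermark set to use for this range */
    unsigned int min_khz;  /* inclusive lower bound */
    unsigned int max_khz;  /* exclusive upper bound */
};

/*
 * Return the set_id of the first range that contains clk_khz.
 * If no range matches, fall back to the set of the last range and
 * report the miss through *matched, mirroring the "Using highest
 * water mark set" wording of the warning.
 */
unsigned int pick_wm_set(unsigned int clk_khz, const struct wm_range *ranges,
                         size_t count, bool *matched)
{
    assert(ranges != NULL && count > 0 && matched != NULL);

    for (size_t k = 0; k < count; k++) {
        if (clk_khz >= ranges[k].min_khz && clk_khz < ranges[k].max_khz) {
            *matched = true;
            return ranges[k].set_id;
        }
    }

    *matched = false;
    return ranges[count - 1].set_id;
}
```

A caller would run this for every clock level and log a warning whenever `matched` comes back false. In this sketch, a clock value expressed in a different unit than the range bounds would miss every range and always end up on the fallback set.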
Created attachment 295907 [details]
Dmesg (new)
Older kernels, e.g. 5.10.23, initialize these values for the discrete card:

    [    1.038643] [drm] DM_PPLIB: values for Engine clock
    [    1.038645] [drm] DM_PPLIB: 214000
    [    1.038646] [drm] DM_PPLIB: 603000
    [    1.038646] [drm] DM_PPLIB: 958000
    [    1.038647] [drm] DM_PPLIB: 1060000
    [    1.038647] [drm] DM_PPLIB: 1128000
    [    1.038647] [drm] DM_PPLIB: 1182000
    [    1.038648] [drm] DM_PPLIB: 1230000
    [    1.038648] [drm] DM_PPLIB: 1275000
    [    1.038649] [drm] DM_PPLIB: Validation clocks:
    [    1.038649] [drm] DM_PPLIB: engine_max_clock: 127500
    [    1.038649] [drm] DM_PPLIB: memory_max_clock: 175000
    [    1.038650] [drm] DM_PPLIB: level : 8
    [    1.038651] [drm] DM_PPLIB: values for Memory clock
    [    1.038651] [drm] DM_PPLIB: 300000
    [    1.038652] [drm] DM_PPLIB: 625000
    [    1.038652] [drm] DM_PPLIB: 1750000
    [    1.038652] [drm] DM_PPLIB: Validation clocks:
    [    1.038652] [drm] DM_PPLIB: engine_max_clock: 127500
    [    1.038653] [drm] DM_PPLIB: memory_max_clock: 175000
    [    1.038653] [drm] DM_PPLIB: level : 8

and for the integrated card:

    [    1.469248] [drm] DM_PPLIB: values for F clock
    [    1.469250] [drm] DM_PPLIB: 400000 in kHz, 2874 in mV
    [    1.469251] [drm] DM_PPLIB: 933000 in kHz, 3224 in mV
    [    1.469252] [drm] DM_PPLIB: 1067000 in kHz, 3924 in mV
    [    1.469253] [drm] DM_PPLIB: 1200000 in kHz, 4074 in mV
    [    1.469256] [drm] DM_PPLIB: values for DCF clock
    [    1.469257] [drm] DM_PPLIB: 300000 in kHz, 2874 in mV
    [    1.469258] [drm] DM_PPLIB: 600000 in kHz, 3224 in mV
    [    1.469259] [drm] DM_PPLIB: 626000 in kHz, 3924 in mV
    [    1.469260] [drm] DM_PPLIB: 654000 in kHz, 4074 in mV
    [    1.469553] [drm] Display Core initialized with v3.2.104!
The new kernel, 5.11.7, prints them only for the integrated card:

    [    1.992374] kernel: [drm] DM_PPLIB: values for F clock
    [    1.992377] kernel: [drm] DM_PPLIB: 400000 in kHz, 2874 in mV
    [    1.992379] kernel: [drm] DM_PPLIB: 933000 in kHz, 3224 in mV
    [    1.992381] kernel: [drm] DM_PPLIB: 1067000 in kHz, 3924 in mV
    [    1.992382] kernel: [drm] DM_PPLIB: 1200000 in kHz, 4074 in mV
    [    1.992385] kernel: [drm] DM_PPLIB: values for DCF clock
    [    1.992387] kernel: [drm] DM_PPLIB: 300000 in kHz, 2874 in mV
    [    1.992388] kernel: [drm] DM_PPLIB: 600000 in kHz, 3224 in mV
    [    1.992390] kernel: [drm] DM_PPLIB: 626000 in kHz, 3924 in mV
    [    1.992391] kernel: [drm] DM_PPLIB: 654000 in kHz, 4074 in mV

So I think this is related, as the new kernel driver can't initialize the values for the discrete card. Please fix.

Created attachment 296035 [details]
possible fix
This patch should fix it.
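The comparison of the two boot logs in this report comes down to checking which `DM_PPLIB: values for ...` headers show up. A rough sketch of such a check on captured log text follows. It assumes the dmesg output has already been read into a NUL-terminated string, and `count_marker` is a hypothetical helper, not part of any kernel or driver tooling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count non-overlapping occurrences of marker in log.
 * Both arguments must be NUL-terminated; marker must not be empty.
 */
size_t count_marker(const char *log, const char *marker)
{
    assert(log != NULL && marker != NULL && marker[0] != '\0');

    size_t hits = 0;
    size_t len = strlen(marker);

    /* Resume the search just past each match so matches never overlap. */
    for (const char *p = strstr(log, marker); p != NULL;
         p = strstr(p + len, marker))
        hits++;

    return hits;
}
```

Calling it with the marker `"DM_PPLIB: values for Engine clock"` on a 5.10.23 log and on a 5.11.7 log would show whether the discrete card's engine clock table was printed at all.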
Thank you Alex for your engagement! Could you please include the patch in the next 5.11.11 release so that I can test it? Sorry, but I am not allowed to compile a kernel on this machine.

Issue fixed in 5.11.12; it even consumes less power now (~1.07 W less).

Before:

    amdgpu-pci-0100
    Adapter: PCI adapter
    vddgfx: 756.00 mV
    edge: +35.0°C (crit = +94.0°C, hyst = -273.1°C)
    power1: 8.14 W (cap = 60.00 W)

After:

    amdgpu-pci-0100
    Adapter: PCI adapter
    vddgfx: 756.00 mV
    edge: +38.0°C (crit = +94.0°C, hyst = -273.1°C)
    power1: 7.07 W (cap = 60.00 W)

Thank you!

After reboot even better:

    amdgpu-pci-0100
    Adapter: PCI adapter
    vddgfx: 756.00 mV
    edge: +35.0°C (crit = +94.0°C, hyst = -273.1°C)
    power1: 6.22 W (cap = 60.00 W)
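The power figures quoted from the sensor output can be turned into absolute and relative savings with a few lines of arithmetic. The sketch below is only an illustration; `power_saved_w` and `power_saved_pct` are hypothetical helper names, and the inputs are the `power1` readings in watts as reported in this thread.

```c
#include <assert.h>

/* Absolute saving in watts between two power readings. */
double power_saved_w(double before_w, double after_w)
{
    return before_w - after_w;
}

/* Saving as a percentage of the "before" reading. */
double power_saved_pct(double before_w, double after_w)
{
    assert(before_w > 0.0);
    return 100.0 * (before_w - after_w) / before_w;
}
```

With 8.14 W before and 7.07 W after, the saving is about 1.07 W, roughly 13%. Against the 6.22 W reading taken after the reboot it is about 1.92 W, roughly 24%.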
Created attachment 295677 [details]
Dmesg

    [    1.240847] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.
    [    1.240850] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.
    [    1.240851] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.
    [    1.240852] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.
    [    1.240853] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.
    [    1.240854] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.
    [    1.240855] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.
    [    1.240856] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.
    [    1.240857] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.
    [    1.240858] amdgpu: Clock is not in range of specified clock range for watermark from DAL! Using highest water mark set.

Dmesg attached