Bug 209163
| Summary: | amdgpu: The CS has been cancelled because the context is lost | | |
|---|---|---|---|
| Product: | Memory Management | Reporter: | Satish patel (satish.in) |
| Component: | Other | Assignee: | drivers_video-dri |
| Status: | NEW --- | | |
| Severity: | high | CC: | alexdeucher, christian.koenig, satish.in |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| Kernel Version: | 4.9.118 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | dmesg log, AMDGPU version information, Mesa_opencl version information, lspci information, VRAM Utilization screen shot | | |

Created attachment 292357 [details]
AMDGPU version information
Created attachment 292359 [details]
Mesa_opencl version information
Created attachment 292361 [details]
lspci information
This is expected behavior: your application tries to use more memory than is physically available:

[71804.930003] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Not enough memory for command submission!

That is most likely a bug in the application, e.g. a memory leak.

(In reply to Christian König from comment #4)
> This is expected behavior, your application tries to use more memory than
> physical available:
>
> [71804.930003] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Not enough memory for
> command submission!
>
> That is most likely a bug in the application, e.g. a memory leak.

Dear Mr. König,

Thanks for your reply. The same application runs for up to 10 days on CentOS 7 (GNOME display) with kernel 3.10.0-1127.el7.x86_64 without using up physical memory and swap. On kernel 4.9.118, however, the same application fails with the error "amdgpu: The CS has been cancelled because the context is lost" even though the system is using only 75% of its 5.83 GB of physical memory and 1% of the 15 GB swap partition.

Why does the system crash (display flickering, touch screen not responding) instead of using the swap area? CPU and memory utilization are still reported when I monitor the machine from another system.

You are running out of VRAM, not system memory.

Can you test this on an up to date kernel as well?

Created attachment 292449 [details]
VRAM Utilization screen shot
Attached is a screen shot of the VRAM utilization error, taken from the output of: cat /sys/kernel/debug/dri/0/amdgpu_vram_mm
(In reply to Christian König from comment #6)
> You are running out of VRAM, not system memory.
>
> Can you test this on an up to date kernel as well?

Is there any way to keep amdgpu from using the full VRAM through a module parameter setting? The same application runs on the same hardware under the GNOME desktop (CentOS 7) with kernel 3.10.xx.1127. I get the error when running the application under X Windows, after 19 hours, whereas it runs for more than 7 days with the operating system and kernel version above.

Try amdgpu.vramlimit=512 on the kernel command line to limit the available VRAM to 512MB.

The problem is certainly some kind of memory leak. You need to test an up to date kernel, like 5.8 or even better the latest bleeding edge amd-staging-drm-next branch.
Created attachment 292355 [details]
dmesg log
I am getting this error after playing the application continuously.
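
Since the replies in this report point to VRAM rather than system memory running out, logging VRAM usage while the application runs may help confirm a leak. What follows is only a minimal sketch in Python, not something taken from this report. It assumes the amdgpu sysfs counters mem_info_vram_used and mem_info_vram_total under /sys/class/drm/card0/device; these are exposed by newer amdgpu drivers and are probably absent on kernel 4.9.118, in which case the function returns None. The card0 path and both function names are illustrative assumptions.

```python
from pathlib import Path

# Assumed sysfs location of the amdgpu device; card0 may differ per system,
# and the counters below may not exist on old kernels such as 4.9.
DEFAULT_DEVICE_DIR = Path("/sys/class/drm/card0/device")


def read_vram_usage(device_dir=DEFAULT_DEVICE_DIR):
    """Return (used_bytes, total_bytes), or None if the counters are missing."""
    device_dir = Path(device_dir)
    try:
        used = int((device_dir / "mem_info_vram_used").read_text().strip())
        total = int((device_dir / "mem_info_vram_total").read_text().strip())
    except (OSError, ValueError):
        # Files absent, unreadable, or not a plain integer.
        return None
    return used, total


def vram_percent(used, total):
    """Percentage of VRAM in use; 0.0 when total is not positive."""
    if total <= 0:
        return 0.0
    return 100.0 * used / total


if __name__ == "__main__":
    usage = read_vram_usage()
    if usage is None:
        print("VRAM counters not available on this system")
    else:
        used, total = usage
        print(f"VRAM used: {used / 2**20:.0f} MiB of {total / 2**20:.0f} MiB "
              f"({vram_percent(used, total):.1f}%)")
```

Running this periodically (for example from cron) and recording the output would show whether VRAM usage climbs steadily over the hours before the error appears, which is what a leak would look like.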