Bug 36072

Summary: celestia causes kernel oops when allocating a lot of memory (for textures)
Product: Drivers
Reporter: aceman (acelists)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED OBSOLETE
Severity: high
CC: alan, alexdeucher, thellstrom
Priority: P1
Hardware: i386
OS: Linux
Kernel Version: 3.5.3
Subsystem:
Regression: Yes
Bisected commit-id:

Description aceman 2011-05-28 10:52:23 UTC
The celestia program often allocates a lot of memory for its textures (around 1.5GB on my 2GB machine with 3GB of swap). I don't know where it stores them, but it appears to be system RAM rather than video memory, because swap is heavily used. Sometimes when this happens the kernel crashes, and afterwards I can only sync and reboot the machine with Alt-SysRq commands. I have only noticed this with kernel 2.6.38.x, not before. I was using the OSS ati driver 6.14.1 with mesa 7.10.2. Everything, including the kernel, is self-compiled (for an AMD fam10h CPU). I do not understand the first line of the log. Where can I increase the vmalloc size? Is that on the kernel command line? In any case, this should crash only the program, not the kernel.

Here is the kernel log:
May 19 21:26:52 coolbox kernel: vmap allocation for size 178982912 failed: use vmalloc=<size> to increase size.
May 19 21:26:52 coolbox kernel: BUG: unable to handle kernel paging request at b892a45d
May 19 21:26:52 coolbox kernel: IP: [<f9696a50>] ttm_mem_io_lock+0x0/0x20 [ttm]
May 19 21:26:52 coolbox kernel: *pde = 00000000 
May 19 21:26:52 coolbox kernel: Oops: 0000 [#1] PREEMPT SMP 
May 19 21:26:52 coolbox kernel: last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0A03:00/device:01/ATK0110:00/hwmon/hwmon0/fan3_input
May 19 21:26:52 coolbox kernel: Modules linked in: fbcon font bitblit softcursor radeon ttm drm_kms_helper drm agpgart fb fbdev autofs4 cfbcopyarea cfbimgblt cfbfillrect nf_conntrack_ftp xt_tcpudp xt_owner xt_multiport nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT ipt_LOG iptable_filter ip_tables x_tables asus_atk0110 snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss cpufreq_conservative cpufreq_ondemand psmouse pcspkr cx88_blackbird firmware_class cx2341x cx8802 tuner_simple tuner_types tda9887 tda8290 tea5767 tuner cx8800 cx88xx rc_core i2c_algo_bit tveeprom v4l2_common videodev btcx_risc videobuf_dma_sg videobuf_core forcedeth snd_hda_codec_via snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc i2c_nforce2 i2c_core ext4 mbcache jbd2 crc16 usbhid powernow_k8 processor mperf ohci_hcd ehci_hcd usbcore fuse
May 19 21:26:52 coolbox kernel: 
May 19 21:26:52 coolbox kernel: Pid: 25014, comm: celestia Not tainted 2.6.38.6 #57 System manufacturer System Product Name/M2N68
May 19 21:26:52 coolbox kernel: EIP: 0060:[<f9696a50>] EFLAGS: 00010246 CPU: 3
May 19 21:26:52 coolbox kernel: EIP is at ttm_mem_io_lock+0x0/0x20 [ttm]
May 19 21:26:52 coolbox kernel: EAX: b892a420 EBX: d03cfc28 ECX: 00000000 EDX: 00000000
May 19 21:26:52 coolbox kernel: ESI: f519a40c EDI: b892a420 EBP: d03cfd48 ESP: d03cfbf0
May 19 21:26:52 coolbox kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
May 19 21:26:52 coolbox kernel: Process celestia (pid: 25014, ti=d03ce000 task=f06fb830 task.ti=d03ce000)
May 19 21:26:52 coolbox kernel: Stack:
May 19 21:26:52 coolbox kernel:  f9696f25 d03cfc28 d03cfd80 d03cfc98 f96970e2 f29e2c2c fffffff4 00040002
May 19 21:26:52 coolbox kernel:  f519a40c f9695c16 f29e2c64 f50135c0 00040001 00000002 00000002 d03cfcd0
May 19 21:26:52 coolbox kernel:  f519a45c f519a40c 00000000 af3e5000 d03cfc48 d03cfc9c f519a40c d03cfc98
May 19 21:26:52 coolbox kernel: Call Trace:
May 19 21:26:52 coolbox kernel:  [<f9696f25>] ? ttm_mem_reg_iounmap+0x35/0x70 [ttm]
May 19 21:26:52 coolbox kernel:  [<f96970e2>] ? ttm_bo_move_memcpy+0x182/0x310 [ttm]
May 19 21:26:52 coolbox kernel:  [<f9695c16>] ? ttm_bo_mem_space+0x306/0x3a0 [ttm]
May 19 21:26:52 coolbox kernel:  [<f97f4fb0>] ? radeon_bo_move+0xe0/0x330 [radeon]
May 19 21:26:52 coolbox kernel:  [<f96944d5>] ? ttm_bo_reserve_locked+0xa5/0x120 [ttm]
May 19 21:26:52 coolbox kernel:  [<f97f4ed0>] ? radeon_bo_move+0x0/0x330 [radeon]
May 19 21:26:52 coolbox kernel:  [<f9694e15>] ? ttm_bo_handle_move_mem+0x135/0x330 [ttm]
May 19 21:26:52 coolbox kernel:  [<f9695ddc>] ? ttm_bo_move_buffer+0x12c/0x140 [ttm]
May 19 21:26:52 coolbox kernel:  [<f9695e86>] ? ttm_bo_validate+0x96/0x120 [ttm]
May 19 21:26:52 coolbox kernel:  [<f97f5f5a>] ? radeon_bo_list_validate+0x5a/0xe0 [radeon]
May 19 21:26:52 coolbox kernel:  [<f980d14c>] ? radeon_cs_ioctl+0x7c/0x1a0 [radeon]
May 19 21:26:52 coolbox kernel:  [<c0206466>] ? prepare_for_delete_or_cut+0x3c6/0x650
May 19 21:26:52 coolbox kernel:  [<f96b3b51>] ? drm_ioctl+0x191/0x380 [drm]
May 19 21:26:52 coolbox kernel:  [<c0206466>] ? prepare_for_delete_or_cut+0x3c6/0x650
May 19 21:26:52 coolbox kernel:  [<f980d0d0>] ? radeon_cs_ioctl+0x0/0x1a0 [radeon]
May 19 21:26:52 coolbox kernel:  [<c018ccf8>] ? handle_pte_fault+0x88/0x630
May 19 21:26:52 coolbox kernel:  [<c026006f>] ? prio_tree_insert+0x12f/0x250
May 19 21:26:52 coolbox kernel:  [<f96b39c0>] ? drm_ioctl+0x0/0x380 [drm]
May 19 21:26:52 coolbox kernel:  [<c01b1edf>] ? do_vfs_ioctl+0x7f/0x590
May 19 21:26:52 coolbox kernel:  [<c011df35>] ? do_page_fault+0x185/0x3a0
May 19 21:26:52 coolbox kernel:  [<c0190bae>] ? mmap_region+0x16e/0x440
May 19 21:26:52 coolbox kernel:  [<c01351d5>] ? irq_exit+0x35/0x70
May 19 21:26:52 coolbox kernel:  [<c01b242d>] ? sys_ioctl+0x3d/0x70
May 19 21:26:52 coolbox kernel:  [<c0206466>] ? prepare_for_delete_or_cut+0x3c6/0x650
May 19 21:26:52 coolbox kernel:  [<c03979a1>] ? syscall_call+0x7/0xb
May 19 21:26:52 coolbox kernel:  [<c0206466>] ? prepare_for_delete_or_cut+0x3c6/0x650
May 19 21:26:52 coolbox kernel:  [<c0206466>] ? prepare_for_delete_or_cut+0x3c6/0x650
May 19 21:26:52 coolbox kernel: Code: 00 00 00 66 31 c0 83 c8 01 89 47 50 eb 9f 90 8d 74 26 00 89 da 89 f0 e8 0f cb ff ff 85 c0 74 a4 89 c5 eb b2 8d b4 26 00 00 00 00 <80> 78 3d 00 74 03 31 c0 c3 84 d2 75 0a 83 c0 28 e8 db f5 cf c6 
May 19 21:26:52 coolbox kernel: EIP: [<f9696a50>] ttm_mem_io_lock+0x0/0x20 [ttm] SS:ESP 0068:d03cfbf0
May 19 21:26:52 coolbox kernel: CR2: 00000000b892a45d
May 19 21:26:52 coolbox kernel: ---[ end trace 70c887d309b00b2d ]---
May 19 21:27:18 coolbox kernel: Emergency Sync complete
May 19 21:27:21 coolbox kernel: Emergency Sync complete
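
The first log line reports the size of the failed vmap request in bytes. As a rough illustration of how large that request is, the sketch below extracts the size from such a line and converts it to MiB and to pages. The helper name `vmap_failure_size` and the regular expression are illustrative assumptions based on this one log sample, not part of any kernel tooling, and a 4 KiB page size is assumed for i386.

```python
import re
from typing import Optional

# Message format as seen in the log above (assumed from that single sample).
VMAP_FAIL_RE = re.compile(r"vmap allocation for size (\d+) failed")

PAGE_SIZE = 4096  # assumed page size on i386


def vmap_failure_size(line: str) -> Optional[int]:
    """Return the requested size in bytes, or None if the line does not match."""
    match = VMAP_FAIL_RE.search(line)
    return int(match.group(1)) if match else None


line = ("May 19 21:26:52 coolbox kernel: vmap allocation for size 178982912 "
        "failed: use vmalloc=<size> to increase size.")
size = vmap_failure_size(line)
if size is not None:
    print(f"{size} bytes = {size / 2**20:.1f} MiB = {size // PAGE_SIZE} pages")
```

For the line above this works out to roughly 170.7 MiB in a single contiguous kernel virtual mapping. That would not fit in a 128 MiB vmalloc area, which is commonly the default on 32-bit x86 (an assumption here; check the configuration of the kernel in question).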
Comment 1 aceman 2011-05-28 10:53:18 UTC
I am also using the transparent hugepages feature, which is set to 'always'.
Comment 2 aceman 2011-08-30 18:15:47 UTC
This is getting worse. With kernel 3.0.3 and mesa 7.11, I get the crash a few seconds after starting celestia, every time.

Aug 30 00:01:03 coolbox kernel: [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
Aug 30 00:01:03 coolbox kernel: vmap allocation for size 178982912 failed: use vmalloc=<size> to increase size.
Aug 30 00:01:03 coolbox kernel: BUG: unable to handle kernel paging request at 9adc8491
Aug 30 00:01:03 coolbox kernel: IP: [<f95daad0>] ttm_bo_move_ttm+0xa0/0xa0 [ttm]
Aug 30 00:01:03 coolbox kernel: *pde = 00000000 
Aug 30 00:01:03 coolbox kernel: Oops: 0000 [#1] PREEMPT SMP 
Aug 30 00:01:03 coolbox kernel: Modules linked in: fbcon font bitblit softcursor radeon ttm drm_kms_helper drm autofs4 agpgart fb fbdev cfbcopyarea cfbimgblt cfbfillrect nf_conntrack_ftp xt_tcpudp xt_owner xt_multiport nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT ipt_LOG iptable_filter ip_tables x_tables asus_atk0110 snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss cpufreq_conservative cpufreq_ondemand psmouse pcspkr cx88_blackbird firmware_class cx2341x cx8802 tuner_simple tuner_types tda9887 tda8290 tea5767 tuner cx8800 cx88xx rc_core i2c_algo_bit tveeprom v4l2_common videodev btcx_risc videobuf_dma_sg videobuf_core forcedeth snd_hda_codec_via snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc i2c_nforce2 i2c_core usbhid powernow_k8 processor mperf ohci_hcd ehci_hcd usbcore fuse
Aug 30 00:01:03 coolbox kernel: 
Aug 30 00:01:03 coolbox kernel: Pid: 5484, comm: celestia Not tainted 2.6.40.3 #2 System manufacturer System Product Name/M2N68
Aug 30 00:01:03 coolbox kernel: EIP: 0060:[<f95daad0>] EFLAGS: 00010246 CPU: 1
Aug 30 00:01:03 coolbox kernel: EIP is at ttm_mem_io_lock+0x0/0x20 [ttm]
Aug 30 00:01:03 coolbox kernel: EAX: 9adc8454 EBX: f565dc1c ECX: 00000000 EDX: 00000000
Aug 30 00:01:03 coolbox kernel: ESI: e04f8440 EDI: 9adc8454 EBP: f565dd3c ESP: f565dbe4
Aug 30 00:01:03 coolbox kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Aug 30 00:01:03 coolbox kernel: Process celestia (pid: 5484, ti=f565c000 task=e06347b0 task.ti=f565c000)
Aug 30 00:01:03 coolbox kernel: Stack:
Aug 30 00:01:03 coolbox kernel:  f95dafa5 f565dc1c f565dd74 f565dc8c f95db162 dfdd402c fffffff4 00040002
Aug 30 00:01:03 coolbox kernel:  e04f8440 f95d9c96 dfdd4064 dfec1640 00040001 00000002 00000002 f565dcc4
Aug 30 00:01:03 coolbox kernel:  e04f8490 e04f8440 00000000 4bee9000 f5e32860 c0107e69 e04f8440 f565dc8c
Aug 30 00:01:03 coolbox kernel: Call Trace:
Aug 30 00:01:03 coolbox kernel:  [<f95dafa5>] ? ttm_mem_reg_iounmap+0x35/0x70 [ttm]
Aug 30 00:01:03 coolbox kernel:  [<f95db162>] ? ttm_bo_move_memcpy+0x182/0x310 [ttm]
Aug 30 00:01:03 coolbox kernel:  [<f95d9c96>] ? ttm_bo_mem_space+0x306/0x3a0 [ttm]
Aug 30 00:01:03 coolbox kernel:  [<c0107e69>] ? nommu_map_page+0x39/0x70
Aug 30 00:01:03 coolbox kernel:  [<f973b030>] ? radeon_bo_move+0xe0/0x330 [radeon]
Aug 30 00:01:03 coolbox kernel:  [<f95d8555>] ? ttm_bo_reserve_locked+0xa5/0x120 [ttm]
Aug 30 00:01:03 coolbox kernel:  [<f95d8bff>] ? ttm_bo_unreserve+0x1f/0x30 [ttm]
Aug 30 00:01:03 coolbox kernel:  [<f973af50>] ? radeon_move_blit.clone.2+0x1f0/0x1f0 [radeon]
Aug 30 00:01:03 coolbox kernel:  [<f95d8e95>] ? ttm_bo_handle_move_mem+0x135/0x340 [ttm]
Aug 30 00:01:03 coolbox kernel:  [<f95d9e5c>] ? ttm_bo_move_buffer+0x12c/0x140 [ttm]
Aug 30 00:01:03 coolbox kernel:  [<f95d9f06>] ? ttm_bo_validate+0x96/0x120 [ttm]
Aug 30 00:01:03 coolbox kernel:  [<f973bfd1>] ? radeon_bo_list_validate+0x71/0xc0 [radeon]
Aug 30 00:01:03 coolbox kernel:  [<f97545e2>] ? radeon_cs_ioctl+0x82/0x1a0 [radeon]
Aug 30 00:01:03 coolbox kernel:  [<c0206466>] ? print_block+0x376/0x510
Aug 30 00:01:03 coolbox kernel:  [<f95efbaf>] ? drm_ioctl+0x18f/0x390 [drm]
Aug 30 00:01:03 coolbox kernel:  [<c0206466>] ? print_block+0x376/0x510
Aug 30 00:01:03 coolbox kernel:  [<f9754560>] ? radeon_cs_finish_pages+0xa0/0xa0 [radeon]
Aug 30 00:01:03 coolbox kernel:  [<c014e373>] ? sched_clock_local+0xc3/0x1b0
Aug 30 00:01:03 coolbox kernel:  [<c010fd5a>] ? x86_pmu_enable+0x1da/0x250
Aug 30 00:01:03 coolbox kernel:  [<c0173f2d>] ? perf_event_task_tick+0xbd/0x220
Aug 30 00:01:03 coolbox kernel:  [<f95efa20>] ? drm_version+0x90/0x90 [drm]
Aug 30 00:01:03 coolbox kernel:  [<c01b7228>] ? do_vfs_ioctl+0x88/0x5e0
Aug 30 00:01:03 coolbox kernel:  [<c03a6b69>] ? schedule+0x1d9/0x680
Aug 30 00:01:03 coolbox kernel:  [<c014c86d>] ? hrtimer_interrupt+0x15d/0x270
Aug 30 00:01:03 coolbox kernel:  [<c0150050>] ? getnstimeofday+0x40/0xe0
Aug 30 00:01:03 coolbox kernel:  [<c01b77bd>] ? sys_ioctl+0x3d/0x70
Aug 30 00:01:03 coolbox kernel:  [<c0206466>] ? print_block+0x376/0x510
Aug 30 00:01:03 coolbox kernel:  [<c03a8e21>] ? syscall_call+0x7/0xb
Aug 30 00:01:03 coolbox kernel:  [<c0206466>] ? print_block+0x376/0x510
Aug 30 00:01:03 coolbox kernel:  [<c0206466>] ? print_block+0x376/0x510
Aug 30 00:01:03 coolbox kernel: Code: 00 00 00 66 31 c0 83 c8 01 89 47 50 eb 9f 90 8d 74 26 00 89 da 89 f0 e8 cf ca ff ff 85 c0 74 a4 89 c5 eb b2 8d b4 26 00 00 00 00 
Aug 30 00:01:03 coolbox kernel: EIP: [<f95daad0>] ttm_mem_io_lock+0x0/0x20 [ttm] SS:ESP 0068:f565dbe4
Aug 30 00:01:03 coolbox kernel: CR2: 000000009adc8491
Aug 30 00:01:03 coolbox kernel: ---[ end trace fed47f0f5bccf5c3 ]---
Comment 3 Michel Dänzer 2011-08-31 10:22:45 UTC
Please attach the full dmesg.

AFAICT ttm_bo_move_memcpy() uses old_copy uninitialized in some error paths. Thomas?
Comment 4 aceman 2011-08-31 10:53:51 UTC
I think this is the full relevant part of dmesg. Actually, it is the content of /var/log/syslog after rebooting the machine (with Alt-Sysrq-R). Should I capture something else?

(In comment 2 the kernel version is displayed as 2.6.40.3, but that is just a renamed 3.0.3, custom compiled.)
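
The renamed version strings in the oops headers can be mapped back to the real ones. A minimal sketch, assuming the custom build follows the 2.6.(40+x) convention for 3.x kernels (the same numbering the kernel's UNAME26 personality reports); the function name `renamed_to_real` is hypothetical:

```python
def renamed_to_real(version: str) -> str:
    """Map a '2.6.(40+x).y' style version string back to '3.x.y'.

    Strings that do not follow the assumed convention are returned unchanged.
    """
    parts = version.split(".")
    if len(parts) >= 3 and parts[:2] == ["2", "6"] and parts[2].isdigit():
        minor = int(parts[2])
        if minor >= 40:
            return ".".join(["3", str(minor - 40)] + parts[3:])
    return version


print(renamed_to_real("2.6.40.3"))  # 3.0.3
print(renamed_to_real("2.6.38.6"))  # 2.6.38.6 (a genuine 2.6 kernel, left as is)
```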
Comment 5 Michel Dänzer 2011-08-31 11:04:19 UTC
(In reply to comment #4)
> Should I capture something else?

Yes, the agp/drm/radeon initialization messages.
Comment 6 aceman 2011-08-31 18:19:20 UTC
Ok, this is from /var/log/messages:


Aug 29 19:04:49 coolbox kernel: Linux agpgart interface v0.103
Aug 29 19:04:49 coolbox kernel: [drm] Initialized drm 1.1.0 20060810
Aug 29 19:04:49 coolbox kernel: [drm] radeon kernel modesetting enabled.
Aug 29 19:04:49 coolbox kernel: radeon 0000:02:00.0: PCI INT A -> Link[LNEB] -> GSI 16 (level, low) -> IRQ 16
Aug 29 19:04:49 coolbox kernel: [drm] initializing kernel modesetting (RV710 0x1002:0x954F).
Aug 29 19:04:49 coolbox kernel: [drm] register mmio base: 0xDFFF0000
Aug 29 19:04:49 coolbox kernel: [drm] register mmio size: 65536
Aug 29 19:04:49 coolbox kernel: ATOM BIOS: 954F.11.12.0.2.AS01
Aug 29 19:04:49 coolbox kernel: radeon 0000:02:00.0: VRAM: 512M 0x0000000000000000 - 0x000000001FFFFFFF (512M used)
Aug 29 19:04:49 coolbox kernel: radeon 0000:02:00.0: GTT: 512M 0x0000000020000000 - 0x000000003FFFFFFF
Aug 29 19:04:49 coolbox kernel: [drm] Detected VRAM RAM=512M, BAR=256M
Aug 29 19:04:49 coolbox kernel: [drm] RAM width 64bits DDR
Aug 29 19:04:49 coolbox kernel: [TTM] Zone  kernel: Available graphics memory: 443484 kiB.
Aug 29 19:04:49 coolbox kernel: [TTM] Zone highmem: Available graphics memory: 1037248 kiB.
Aug 29 19:04:49 coolbox kernel: [TTM] Initializing pool allocator.
Aug 29 19:04:49 coolbox kernel: [drm] radeon: 512M of VRAM memory ready
Aug 29 19:04:49 coolbox kernel: [drm] radeon: 512M of GTT memory ready.
Aug 29 19:04:49 coolbox kernel: [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
Aug 29 19:04:49 coolbox kernel: [drm] Driver supports precise vblank timestamp query.
Aug 29 19:04:49 coolbox kernel: radeon 0000:02:00.0: radeon: using MSI.
Aug 29 19:04:49 coolbox kernel: [drm] radeon: irq initialized.
Aug 29 19:04:49 coolbox kernel: [drm] GART: num cpu pages 131072, num gpu pages 131072
Aug 29 19:04:49 coolbox kernel: [drm] Loading RV710 Microcode
Aug 29 19:04:50 coolbox kernel: radeon 0000:02:00.0: WB enabled
Aug 29 19:04:50 coolbox kernel: [drm] ring test succeeded in 1 usecs
Aug 29 19:04:50 coolbox kernel: [drm] radeon: ib pool ready.
Aug 29 19:04:50 coolbox kernel: [drm] ib test succeeded in 0 usecs
Aug 29 19:04:50 coolbox kernel: [drm] Radeon Display Connectors
Aug 29 19:04:50 coolbox kernel: [drm] Connector 0:
Aug 29 19:04:50 coolbox kernel: [drm]   HDMI-A
Aug 29 19:04:50 coolbox kernel: [drm]   HPD1
Aug 29 19:04:50 coolbox kernel: [drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
Aug 29 19:04:50 coolbox kernel: [drm]   Encoders:
Aug 29 19:04:50 coolbox kernel: [drm]     DFP1: INTERNAL_UNIPHY
Aug 29 19:04:50 coolbox kernel: [drm] Connector 1:
Aug 29 19:04:50 coolbox kernel: [drm]   VGA
Aug 29 19:04:50 coolbox kernel: [drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
Aug 29 19:04:50 coolbox kernel: [drm]   Encoders:
Aug 29 19:04:50 coolbox kernel: [drm]     CRT2: INTERNAL_KLDSCP_DAC2
Aug 29 19:04:50 coolbox kernel: [drm] Connector 2:
Aug 29 19:04:50 coolbox kernel: [drm]   DVI-I
Aug 29 19:04:50 coolbox kernel: [drm]   HPD4
Aug 29 19:04:50 coolbox kernel: [drm]   DDC: 0x7f10 0x7f10 0x7f14 0x7f14 0x7f18 0x7f18 0x7f1c 0x7f1c
Aug 29 19:04:50 coolbox kernel: [drm]   Encoders:
Aug 29 19:04:50 coolbox kernel: [drm]     CRT1: INTERNAL_KLDSCP_DAC1
Aug 29 19:04:50 coolbox kernel: [drm]     DFP2: INTERNAL_UNIPHY2
Aug 29 19:04:50 coolbox kernel: [drm] Internal thermal controller without fan control
Aug 29 19:04:50 coolbox kernel: [drm] radeon: power management initialized
Aug 29 19:04:50 coolbox kernel: [drm] fb mappable at 0xC0142000
Aug 29 19:04:50 coolbox kernel: [drm] vram apper at 0xC0000000
Aug 29 19:04:50 coolbox kernel: [drm] size 9216000
Aug 29 19:04:50 coolbox kernel: [drm] fb depth is 24
Aug 29 19:04:50 coolbox kernel: [drm]    pitch is 7680
Aug 29 19:04:50 coolbox kernel: fb0: radeondrmfb frame buffer device
Aug 29 19:04:50 coolbox kernel: drm: registered panic notifier
Aug 29 19:04:50 coolbox kernel: [drm] Initialized radeon 2.9.0 20080528 for 0000:02:00.0 on minor 0
Comment 7 Alan 2012-08-24 12:44:13 UTC
If this is still seen in modern (3.2+) kernels, please re-open. Thanks.
Comment 8 aceman 2012-08-30 15:32:57 UTC
Yes, still happens randomly, on kernel 3.5.3, X.org 1.12, ati driver 6.99.99, Mesa git (9.0):

Aug 29 23:46:50 coolbox kernel: [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
Aug 29 23:47:16 coolbox last message repeated 646 times
Aug 29 23:48:07 coolbox last message repeated 554 times
Aug 29 23:50:15 coolbox kernel: [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
Aug 29 23:50:41 coolbox last message repeated 6 times
Aug 29 23:50:41 coolbox kernel: vmap allocation for size 178966528 failed: use vmalloc=<size> to increase size.
Aug 29 23:50:41 coolbox kernel: [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
Aug 29 23:50:49 coolbox last message repeated 131 times
Aug 29 23:50:58 coolbox kernel: ------------[ cut here ]------------
Aug 29 23:50:58 coolbox kernel: kernel BUG at drivers/gpu/drm/ttm/ttm_bo.c:1167!
Aug 29 23:50:58 coolbox kernel: invalid opcode: 0000 [#1] SMP
Aug 29 23:50:58 coolbox kernel: Modules linked in: usb_storage autofs4 nf_conntrack_ftp xt_tcpudp xt_owner xt_multiport nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_LOG iptable_filter ip_tables x_tables asus_atk0110 snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss cx88_blackbird cx2341x cx8802 tuner_simple tuner_types tda9887 tda8290 tea5767 tuner cx8800 cx88xx rc_core tveeprom v4l2_common videodev videobuf_dma_sg videobuf_core btcx_risc k10temp forcedeth snd_hda_codec_via snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc i2c_nforce2 ext4 mbcache jbd2 crc16 usbhid powernow_k8 mperf ohci_hcd ehci_hcd usbcore usb_common fuse [last unloaded: microcode]
Aug 29 23:50:58 coolbox kernel:
Aug 29 23:50:58 coolbox kernel: Pid: 19931, comm: celestia Not tainted 2.6.45.3 #93 System manufacturer System Product Name/M2N68
Aug 29 23:50:58 coolbox kernel: EIP: 0060:[<c030a069>] EFLAGS: 00010206 CPU: 3
Aug 29 23:50:58 coolbox kernel: EIP is at ttm_bo_check_placement+0x19/0x20
Aug 29 23:50:58 coolbox kernel: EAX: e1f12c2c EBX: 00002aac ECX: 00000000 EDX: 00000100
Aug 29 23:50:58 coolbox kernel: ESI: ec13c468 EDI: ebbf71c0 EBP: 00021240 ESP: eaa33d70
Aug 29 23:50:58 coolbox kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Aug 29 23:50:58 coolbox kernel: CR0: 8005003b CR2: aa634000 CR3: 2afab000 CR4: 000007f0
Aug 29 23:50:58 coolbox kernel: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Aug 29 23:50:58 coolbox kernel: DR6: ffff0ff0 DR7: 00000400
Aug 29 23:50:58 coolbox kernel: Process celestia (pid: 19931, ti=eaa32000 task=ec27a030 task.ti=eaa32000)
Aug 29 23:50:58 coolbox kernel: Stack:
Aug 29 23:50:58 coolbox kernel:  c030b4c9 00000000 00000000 01aac000 e1f12c2c e1f12c00 fffffff4 00000001
Aug 29 23:50:58 coolbox kernel:  02aac000 c033bb07 00000000 e1f12c14 00000001 00000000 00000001 00000000
Aug 29 23:50:58 coolbox kernel:  00021240 00000000 c033b850 00000001 00000001 00000000 ec13c468 ec13cd38
Aug 29 23:50:58 coolbox kernel: Call Trace:
Aug 29 23:50:58 coolbox kernel:  [<c030b4c9>] ? ttm_bo_init+0x179/0x370
Aug 29 23:50:58 coolbox kernel:  [<c033bb07>] ? radeon_bo_create+0x197/0x290
Aug 29 23:50:58 coolbox kernel:  [<c033b850>] ? radeon_bo_clear_va+0x80/0x80
Aug 29 23:50:58 coolbox kernel:  [<c034c2bc>] ? radeon_gem_object_create+0x5c/0xf0
Aug 29 23:50:58 coolbox kernel:  [<c034c696>] ? radeon_gem_create_ioctl+0x66/0xf0
Aug 29 23:50:58 coolbox kernel:  [<c0275cb3>] ? _copy_from_user+0x33/0x70
Aug 29 23:50:58 coolbox kernel:  [<c034c630>] ? radeon_gem_pwrite_ioctl+0x30/0x30
Aug 29 23:50:58 coolbox kernel:  [<c02f349c>] ? drm_ioctl+0x36c/0x3d0
Aug 29 23:50:58 coolbox kernel:  [<c01c645d>] ? sys_umount+0x37d/0x380
Aug 29 23:50:58 coolbox kernel:  [<c01c645d>] ? sys_umount+0x37d/0x380
Aug 29 23:50:58 coolbox kernel:  [<c034c630>] ? radeon_gem_pwrite_ioctl+0x30/0x30
Aug 29 23:50:58 coolbox kernel:  [<c019420a>] ? free_pgtables+0x8a/0xb0
Aug 29 23:50:58 coolbox kernel:  [<c0193d49>] ? tlb_finish_mmu+0x9/0x30
Aug 29 23:50:58 coolbox kernel:  [<c02f3130>] ? drm_copy_field+0x70/0x70
Aug 29 23:50:58 coolbox kernel:  [<c01bc24a>] ? do_vfs_ioctl+0x7a/0x580
Aug 29 23:50:58 coolbox kernel:  [<c019ab25>] ? do_munmap+0x245/0x300
Aug 29 23:50:58 coolbox kernel:  [<c01c645d>] ? sys_umount+0x37d/0x380
Aug 29 23:50:58 coolbox kernel:  [<c01bc77e>] ? sys_ioctl+0x2e/0x50
Aug 29 23:50:58 coolbox kernel:  [<c0486cc5>] ? syscall_call+0x7/0xb
Aug 29 23:50:58 coolbox kernel:  [<c01c645d>] ? sys_umount+0x37d/0x380
Aug 29 23:50:58 coolbox kernel:  [<c01c645d>] ? sys_umount+0x37d/0x380
Aug 29 23:50:58 coolbox kernel: Code: 8b 6c 24 14 83 c4 18 c3 8d 76 00 8d bc 27 00 00 00 00 8b 0a 85 c9 75 09 83 7a 04 00 75 03 31 c0 c3 8b 52 04 29 ca 39 50 44 76 f3 <0f> 0b 90 8d 74 26 00 8b 4a 14 53 6b d9 50 8b 44 18 1c a8 01 75
Aug 29 23:50:58 coolbox kernel: EIP: [<c030a069>] ttm_bo_check_placement+0x19/0x20 SS:ESP 0068:eaa33d70
Aug 29 23:50:58 coolbox kernel: ---[ end trace c7e3e649e5a39cf1 ]---
Comment 9 Alex Deucher 2013-12-23 14:36:19 UTC
This is an out of memory error.  Is it still an issue on a newer kernel and gfx stack?
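
The -12 in "Failed to parse relocation -12!" is a negated errno value, the usual way kernel code reports failures. A small sketch for decoding such values with Python's standard errno module; the helper name `describe_kernel_error` is illustrative, and errno numbers are platform-specific (12 is ENOMEM on Linux):

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Turn a negative kernel-style return code into 'NAME (message)'."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name} ({os.strerror(number)})"


print(describe_kernel_error(-12))
```

On a Linux system the printed name is ENOMEM, which matches the out-of-memory reading of the log.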
Comment 10 aceman 2013-12-23 21:55:41 UTC
I do not have a working celestia at this time so I can't test it in the near future.
Comment 11 aceman 2018-02-28 21:12:29 UTC
I no longer have that particular GPU, but I also haven't seen the problem in ages. I added the vmalloc argument to the kernel cmdline, which may also have helped.
With the recent amdgpu kernel driver I am trying to run without this argument and haven't seen any problems with Celestia yet. I also have 4GB of VRAM on a Polaris11 GPU now.
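
To check whether a vmalloc= argument is in effect, the running kernel's command line can be inspected. The sketch below is only an illustration under stated assumptions: it reads /proc/cmdline (Linux only), the helper name `vmalloc_bytes` is hypothetical, and it handles just a plain number with an optional K, M or G suffix.

```python
import re
from typing import Optional

SUFFIXES = {"": 1, "K": 2**10, "M": 2**20, "G": 2**30}


def vmalloc_bytes(cmdline: str) -> Optional[int]:
    """Return the vmalloc= size in bytes, or None if the argument is absent."""
    match = re.search(r"(?:^|\s)vmalloc=(\d+)([KMG]?)(?=\s|$)", cmdline, re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1)) * SUFFIXES[match.group(2).upper()]


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print(vmalloc_bytes(f.read()))
    except OSError:
        print("no /proc/cmdline on this system")
```

A result of None means the argument is not set and the kernel is using its built-in default.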