Bug 210841 - divide error in Calculate256BBlockSizes
Summary: divide error in Calculate256BBlockSizes
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel) (show other bugs)
Hardware: x86-64 Linux
: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-12-22 07:06 UTC by Neil Shepperd
Modified: 2020-12-22 07:07 UTC (History)
0 users

See Also:
Kernel Version: 5.9.14
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Full kernel log from journalctl -b-1 -k (243.67 KB, text/plain)
2020-12-22 07:06 UTC, Neil Shepperd
Details

Description Neil Shepperd 2020-12-22 07:06:06 UTC
Created attachment 294287 [details]
Full kernel log from journalctl -b-1 -k

display crashed with this error message: 

    divide error: 0000 [#1] PREEMPT SMP NOPTI

This looks superficially similar to https://bugzilla.kernel.org/show_bug.cgi?id=206987 or https://bugzilla.kernel.org/show_bug.cgi?id=207979; but my kernel version has the patch that should prevent this, and the crash also seems to be in a different place.

   RIP: 0010:Calculate256BBlockSizes+0x74/0x100 [amdgpu]

objdump -d amdgpu.ko

00000000002423d0 <Calculate256BBlockSizes>:
  2423d0:       e8 00 00 00 00          callq  2423d5 <Calculate256BBlockSizes+0x5>
  2423d5:       89 f0                   mov    %esi,%eax
  2423d7:       4c 8b 5c 24 10          mov    0x10(%rsp),%r11
  2423dc:       89 d6                   mov    %edx,%esi
  2423de:       41 89 ca                mov    %ecx,%r10d
  2423e1:       83 ff 02                cmp    $0x2,%edi
  2423e4:       76 47                   jbe    24242d <Calculate256BBlockSizes+0x5d>
  2423e6:       83 ff 08                cmp    $0x8,%edi
  2423e9:       74 42                   je     24242d <Calculate256BBlockSizes+0x5d>
  2423eb:       85 c0                   test   %eax,%eax
  2423ed:       0f 85 8a 00 00 00       jne    24247d <Calculate256BBlockSizes+0xad>
  2423f3:       41 c7 00 01 00 00 00    movl   $0x1,(%r8)
  2423fa:       41 c7 01 01 00 00 00    movl   $0x1,(%r9)
  242401:       b9 00 01 00 00          mov    $0x100,%ecx
  242406:       31 d2                   xor    %edx,%edx
  242408:       89 c8                   mov    %ecx,%eax
  24240a:       f7 f6                   div    %esi
  24240c:       31 d2                   xor    %edx,%edx
  24240e:       41 f7 30                divl   (%r8)
  242411:       48 8b 54 24 08          mov    0x8(%rsp),%rdx
  242416:       89 02                   mov    %eax,(%rdx)
  242418:       89 c8                   mov    %ecx,%eax
  24241a:       31 d2                   xor    %edx,%edx
  24241c:       41 f7 f2                div    %r10d
  24241f:       31 d2                   xor    %edx,%edx
  242421:       41 f7 31                divl   (%r9)
  242424:       41 89 03                mov    %eax,(%r11)
  242427:       b8 01 00 00 00          mov    $0x1,%eax
  24242c:       c3                      retq
  24242d:       85 c0                   test   %eax,%eax
  24242f:       75 34                   jne    242465 <Calculate256BBlockSizes+0x95>
  242431:       41 c7 00 01 00 00 00    movl   $0x1,(%r8)
  242438:       b9 01 00 00 00          mov    $0x1,%ecx
  24243d:       31 d2                   xor    %edx,%edx
  24243f:       b8 00 01 00 00          mov    $0x100,%eax
    242444:       f7 f6                   div    %esi            <-- [Calculate256BBlockSizes+0x74]
  242446:       48 8b 74 24 08          mov    0x8(%rsp),%rsi
  24244b:       31 d2                   xor    %edx,%edx
  24244d:       f7 f1                   div    %ecx

So the crash is a div instruction. I think corresponding to either the 256 / BytePerPixelY or 256 / BytePerPixelC in Calculate256BBlockSizes. So it seems that one of those is being passed in as 0 from dml21_ModeSupportAndSystemConfigurationFull?

-- 

[Dec22 16:25] divide error: 0000 [#1] PREEMPT SMP NOPTI
[  +0.000008] CPU: 9 PID: 842 Comm: Xorg Tainted: G        W         5.9.14-arch1-1 #1
[  +0.000003] Hardware name: LENOVO 20UHCTO1WW/20UHCTO1WW, BIOS R1CET56W(1.25 ) 09/15/2020
[  +0.000182] RIP: 0010:Calculate256BBlockSizes+0x74/0x100 [amdgpu]
[  +0.000006] Code: 31 d2 41 f7 f2 31 d2 41 f7 31 41 89 03 b8 01 00 00 00 c3 85 c0 75 34 41 c7 00 01 00 00 00 b9 01 00 00 00 31 d2 b8 00 01 00 00 <f7> f6 48 8b 74 24 08 31 d2 f7 f1 89 06 b8 01 00 00 00 41 c7 01 00
[  +0.000003] RSP: 0018:ffffa9d3c195b650 EFLAGS: 00010246
[  +0.000003] RAX: 0000000000000100 RBX: ffff8ea85cba31c8 RCX: 0000000000000008
[  +0.000002] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
[  +0.000002] RBP: 0000000000000000 R08: ffff8ea85cba9914 R09: ffff8ea85cba9954
[  +0.000001] R10: 0000000000000000 R11: ffff8ea85cba9974 R12: ffff8ea85cba9914
[  +0.000002] R13: 0000000000000000 R14: ffff8ea85cba2830 R15: ffff8ea85cba2090
[  +0.000003] FS:  00007f4289f3a2c0(0000) GS:ffff8eab3fa40000(0000) knlGS:0000000000000000
[  +0.000002] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000002] CR2: 00007f80bc2da000 CR3: 00000003ea4fc000 CR4: 0000000000350ee0
[  +0.000002] Call Trace:
[  +0.000100]  dml21_ModeSupportAndSystemConfigurationFull+0x1109/0x5d10 [amdgpu]
[  +0.000005]  ? sugov_fast_switch+0x5b/0xe0
[  +0.000002]  ? update_load_avg+0x62d/0x660
[  +0.000001]  ? update_curr+0x73/0x1f0
[  +0.000002]  ? dequeue_entity+0xa4/0x440
[  +0.000001]  ? newidle_balance+0xec/0x400
[  +0.000001]  ? preempt_count_add+0x49/0xa0
[  +0.000003]  ? kernel_init_free_pages+0x6d/0x90
[  +0.000001]  ? prep_new_page+0xa2/0xb0
[  +0.000001]  ? get_page_from_freelist+0x11e0/0x12d0
[  +0.000002]  ? __mod_zone_page_state+0x66/0xa0
[  +0.000002]  ? __free_one_page+0x58/0x3d0
[  +0.000001]  ? __mod_zone_page_state+0x66/0xa0
[  +0.000000]  ? preempt_count_add+0x49/0xa0
[  +0.000002]  ? kernel_init_free_pages+0x6d/0x90
[  +0.000001]  ? prep_new_page+0xa2/0xb0
[  +0.000001]  ? get_page_from_freelist+0x11e0/0x12d0
[  +0.000058]  dml_get_voltage_level+0x138/0x1d0 [amdgpu]
[  +0.000062]  dcn20_fast_validate_bw+0x359/0x670 [amdgpu]
[  +0.000057]  ? dcn21_validate_bandwidth+0x95/0x2b0 [amdgpu]
[  +0.000056]  dcn21_validate_bandwidth+0xd3/0x2b0 [amdgpu]
[  +0.000055]  dc_validate_global_state+0x2f2/0x390 [amdgpu]
[  +0.000060]  amdgpu_dm_atomic_check+0xf05/0x1080 [amdgpu]
[  +0.000016]  drm_atomic_check_only+0x57c/0x7f0 [drm]
[  +0.000011]  ? drm_mode_object_put.part.0+0x1f/0x50 [drm]
[  +0.000009]  ? drm_atomic_set_property+0x8e/0xae0 [drm]
[  +0.000009]  drm_atomic_commit+0x13/0x50 [drm]
[  +0.000009]  drm_mode_obj_set_property_ioctl+0x156/0x3d0 [drm]
[  +0.000009]  ? drm_mode_obj_find_prop_id+0x40/0x40 [drm]
[  +0.000008]  drm_ioctl_kernel+0xb2/0x100 [drm]
[  +0.000009]  drm_ioctl+0x215/0x390 [drm]
[  +0.000009]  ? drm_mode_obj_find_prop_id+0x40/0x40 [drm]
[  +0.000043]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[  +0.000004]  __x64_sys_ioctl+0x83/0xb0
[  +0.000003]  do_syscall_64+0x33/0x40
[  +0.000002]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  +0.000002] RIP: 0033:0x7f428a901f6b
[  +0.000001] Code: 89 d8 49 8d 3c 1c 48 f7 d8 49 39 c4 72 b5 e8 1c ff ff ff 85 c0 78 ba 4c 89 e0 5b 5d 41 5c c3 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 ae 0c 00 f7 d8 64 89 01 48
[  +0.000001] RSP: 002b:00007ffd2caa5758 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  +0.000001] RAX: ffffffffffffffda RBX: 00007ffd2caa5790 RCX: 00007f428a901f6b
[  +0.000000] RDX: 00007ffd2caa5790 RSI: 00000000c01864ba RDI: 000000000000000d
[  +0.000001] RBP: 00000000c01864ba R08: 000000000000006a R09: 00000000cccccccc
[  +0.000001] R10: 0000000000000fff R11: 0000000000000246 R12: 00005611da332f60
[  +0.000000] R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000003
[  +0.000002] Modules linked in: iptable_mangle iptable_raw xt_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_mark ip6table_mangle xt_comment xt_addrtype ip6table_raw ip6_tables wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 ip6_udp_tunnel udp_tunnel libcurve25519_generic libchacha libblake2s_generic uas usb_storage uinput rfcomm 8021q garp mrp stp llc joydev mousedev snd_soc_dmic snd_acp3x_pdm_dma snd_acp3x_rn ccm bnep snd_soc_core algif_aead snd_compress des_generic tps6598x ac97_bus libdes snd_pcm_dmaengine roles wmi_bmof ecb algif_skcipher amdgpu snd_hda_codec_realtek iwlmvm cmac snd_hda_codec_generic md4 algif_hash edac_mce_amd snd_hda_codec_hdmi af_alg kvm_amd snd_hda_intel snd_intel_dspcfg kvm mac80211 snd_hda_codec btusb irqbypass btrtl rapl libarc4 uvcvideo btbcm snd_hda_core btintel gpu_sched snd_hwdep i2c_algo_bit pcspkr videobuf2_vmalloc thinkpad_acpi ttm videobuf2_memops psmouse input_leds iwlwifi snd_pcm
[  +0.000027]  videobuf2_v4l2 ext4 bluetooth drm_kms_helper nvram videobuf2_common k10temp r8169 cec snd_timer ecdh_generic mbcache ledtrig_audio sp5100_tco ecc realtek jbd2 i2c_piix4 tpm_crb crc16 rc_core cfg80211 videodev snd_rn_pci_acp3x mdio_devres snd syscopyarea sysfillrect of_mdio sysimgblt snd_pci_acp3x fb_sys_fops soundcore ucsi_acpi fixed_phy mc tpm_tis ipmi_devintf libphy rfkill ac battery ipmi_msghandler typec_ucsi i2c_scmi evdev mac_hid i2c_multi_instantiate typec tpm_tis_core wmi acpi_cpufreq pinctrl_amd drm pkcs8_key_parser crypto_user fuse agpgart ip_tables x_tables dm_crypt cbc encrypted_keys dm_mod trusted tpm rtsx_pci_sdmmc serio_raw mmc_core atkbd libps2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper ccp xhci_pci xhci_pci_renesas rng_core xhci_hcd ehci_pci rtsx_pci ehci_hcd i8042 serio btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq
[  +0.000050] ---[ end trace a3da0200bd831ed5 ]---
[  +0.000058] RIP: 0010:Calculate256BBlockSizes+0x74/0x100 [amdgpu]
[  +0.000027] Code: 31 d2 41 f7 f2 31 d2 41 f7 31 41 89 03 b8 01 00 00 00 c3 85 c0 75 34 41 c7 00 01 00 00 00 b9 01 00 00 00 31 d2 b8 00 01 00 00 <f7> f6 48 8b 74 24 08 31 d2 f7 f1 89 06 b8 01 00 00 00 41 c7 01 00
[  +0.000000] RSP: 0018:ffffa9d3c195b650 EFLAGS: 00010246
[  +0.000001] RAX: 0000000000000100 RBX: ffff8ea85cba31c8 RCX: 0000000000000008
[  +0.000001] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
[  +0.000000] RBP: 0000000000000000 R08: ffff8ea85cba9914 R09: ffff8ea85cba9954
[  +0.000001] R10: 0000000000000000 R11: ffff8ea85cba9974 R12: ffff8ea85cba9914
[  +0.000001] R13: 0000000000000000 R14: ffff8ea85cba2830 R15: ffff8ea85cba2090
[  +0.000001] FS:  00007f4289f3a2c0(0000) GS:ffff8eab3fa40000(0000) knlGS:0000000000000000
[  +0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000001] CR2: 00007f80bc2da000 CR3: 00000003ea4fc000 CR4: 0000000000350ee0
Comment 1 Neil Shepperd 2020-12-22 07:07:40 UTC
The laptop was unresponsive, but I could ssh in to read dmesg and attempt reboot (which seemingly failed, requiring hard reset).

Note You need to log in before you can comment on or make changes to this bug.