Distribution: Debian
Hardware Environment: Asus P4T533-C; 3GHz PIV; 1GB RAM
Software Environment: Kernel 2.5.61 built without mkinitrd; boot with Lilo
Problem Description: Nothing displayed after "Booting Linux..." (last message from Lilo)
Lilo 22.4; module-init-tools 0.9.9
Steps to reproduce: Build-n-boot.
Please attach compiler version info and .config for the kernel
Things to try:
1) Boot with acpi=off vga=1
2) Build the kernel for i386 instead of a higher CPU type
I'm having a very similar problem with 2.5.60, 2.5.65, 2.5.68, 2.5.69, and 2.5.69-ac1. GRUB says it loads the kernel correctly, uncompresses, signals that it is handing control to the new kernel, and then hangs immediately after. Here are the relevant specs:

Motherboard: Intel 840 Workstation board w/ Dual Intel P-III 866MHz (Coppermine) CPUs, 512MB RDRAM.
Linux distribution: Redhat 9 w/ latest up2dates, and Rusty's module-init-tools-0.9.11a
GCC version (gcc -v):
  Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2.2/specs
  Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --host=i386-redhat-linux
  Thread model: posix
  gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
.config and lspci: See attached.

The system works just fine with 2.4 kernels (stock and Redhat). I've tried:
1) Compiling with cpu=generic i386
2) acpi=off vga=1

Any suggestions would be greatly appreciated.
Created attachment 335 [details] .config file from 2.5.69-ac1
Created attachment 336 [details] lspci output
The problem (on my box) turned out to be two-fold:
1) Until recently, SMP systems had a hard time mounting the root FS when root was EXT2, EXT3 or XFS (subsequently fixed).
2) The kernel is not able to load modules. The modules that caused problems were ATKBD and serial IO. As a result, /dev/tty? and /dev/ttyS? could not be found. It would boot, but you can't log in :-)
Created attachment 341 [details] early printk
Select the VGA output option in the early printk section of 'Kernel Hacking'.
Thanks Zwane for the suggestion, though still no joy. I've also recompiled as UP, hoping that may simplify things. Anyhow, I'm not afraid to get my hands dirty in some C code, so tell me where to start sprinkling (early-)printk's and I'll narrow down where things are blowing up. It is also possible for me to attach a serial console, though I'd have to dig up a NULL-modem cable from somewhere. Anyhow -- just point me in a useful direction...
Hmmm... I've read over the early-printk patch, and things don't look good. The printk right after register_early_consoles() never makes it to the screen. Any other things to try?
Created attachment 344 [details] bare config
Could you try building a kernel with this configuration and booting it?
Zwane, it looks like you attached the wrong config file. Please re-send and I'll try ASAP.
Created attachment 350 [details] bare'ish config
I attached the wrong config file previously.
Good news! My system boots with the minimal config (though it can't do much more than just boot). I'll start re-adding things to the config until something goes boom. Any suggestions?
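One way to narrow down which options to re-add is to list what differs between the config that boots and the one that hangs, then re-enable those options a few at a time. The following is only a sketch, not part of anyone's actual workflow in this bug: the function names `parse_config` and `config_diff` are made up for illustration, and it assumes the usual .config syntax (`CONFIG_FOO=y` and `# CONFIG_FOO is not set`).

```python
def parse_config(text):
    """Return {option: value} for a kernel .config; unset options map to 'n'."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # "# CONFIG_ACPI is not set" -> CONFIG_ACPI is disabled
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def config_diff(working_text, failing_text):
    """Map each differing option to (value in working, value in failing)."""
    working = parse_config(working_text)
    failing = parse_config(failing_text)
    diff = {}
    for name in sorted(set(working) | set(failing)):
        w = working.get(name, "n")  # options absent from a file count as unset
        f = failing.get(name, "n")
        if w != f:
            diff[name] = (w, f)
    return diff


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names): python config_diff.py config.bare config.full
    with open(sys.argv[1]) as a, open(sys.argv[2]) as b:
        for name, (w, f) in config_diff(a.read(), b.read()).items():
            print(f"{name}: working={w} failing={f}")
```

Re-enabling half of the listed options at a time and rebooting turns the search into a bisection rather than a one-by-one walk.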
Avoid any power management related things like ACPI. For now, just turn on your root fs and any IDE/SCSI controllers you may require.
How is this looking in 2.5.70?
Works much better in 2.5.69-bk17. Have not built 2.5.70 yet...
I'm not running 2.5.70-mm1, which works with a .config based on the one that Zwane sent (with all the things I need added). Things look good except that I am getting ~66000 interrupts per second due to the ACPI:

           CPU0       CPU1
  0:   24458828   24627983    IO-APIC-edge  timer
  1:          2         12    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  8:          1          0    IO-APIC-edge  rtc
  9: 1600921506 1600747252   IO-APIC-level  acpi
 12:          4         51    IO-APIC-edge  i8042
 14:          1          1    IO-APIC-edge  ide0
 15:          2          0    IO-APIC-edge  ide1
 17:         24         25   IO-APIC-level  aic7xxx
 19:     909103     903356   IO-APIC-level  3ware Storage Controller, eth0
NMI:    1460789    1472578
LOC:   49085767   49085766
ERR:          0
MIS:          0

OProfile while copying a large tree of hardlinks (cp -Rl x y):

vma      samples  %        symbol name
c01caf95 1217698  41.5755  acpi_os_read_port
c0108bf0 1200771  40.9975  default_idle
c01d633a 150704   5.14544  acpi_hw_low_level_read
c010d8a0 54398    1.85729  do_IRQ
c010b240 23926    0.816897 irq_entries_start
c0174b40 23354    0.797368 __d_lookup
c01d019c 20739    0.708085 acpi_ev_gpe_detect
c01cb3ef 20002    0.682922 acpi_os_acquire_lock
c01d60bc 16298    0.556457 acpi_hw_register_read
c01164b0 16033    0.547409 mark_offset_tsc
c0119980 13953    0.476393 end_level_ioapic_irq
c01764a0 12797    0.436924 find_inode_fast
c019e460 11037    0.376833 ext3_find_entry
c01ce95f 8994     0.307079 acpi_ev_fixed_event_detect
c019f110 8456     0.28871  add_dirent_to_buf
c010d560 8144     0.278058 handle_IRQ_event
c01696b0 6286     0.214621 link_path_walk
c0111e70 5330     0.18198  timer_interrupt
c010bac0 5297     0.180854 common_interrupt
c01975c0 5266     0.179795 ext3_check_dir_entry
c010b17a 4528     0.154598 restore_all
c0108c80 4177     0.142614 cpu_idle
[...]

vmstat during that copy:

[...]
   procs                      memory      swap          io     system      cpu
 r  b  w   swpd   free   buff  cache   si   so    bi    bo    in    cs us sy id
 0  1  0      0   4648 232224   9176    0    0   228     0 66668   124  1 45 55
 0  1  0      0   5288 231596   9192    0    0   332   412 66666   229  0 49 51
 2  1  2      0   4312 232292   9176    0    0   340  1940 65822   268  1 50 48
 0  1  0      0   5160 231440   9212    0    0   348   296 66538   581  5 53 43
 0  1  0      0   4648 232136   9128    0    0   472     0 66931   247  0 50 50
 0  1  0      0   5096 231688   9168    0    0   440   400 66684   286  0 52 48
 0  1  0      0   4456 232376   9160    0    0   448     0 66890   235  1 50 49
 0  1  0      0   4968 231796   9264    0    0   364  2128 65770   241  0 58 42
 0  1  0      0   4456 232300   9236    0    0   336    12 66989   184  1 47 52
 0  1  0      0   5608 231340   9244    0    0   276   392 66930   159  0 51 49
 0  1  0      0   5224 231728   9264    0    0   268     0 66825   142  0 49 51
 0  1  0      0   4712 232252   9284    0    0   356     0 66992   191  0 50 50
 0  1  0      0   4264 232776   9168    0    0   332  1368 66248   205  1 51 48
 0  1  0      0   5096 231848   9280    0    0   220   444 66642   165  0 51 49
 0  1  0      0   4648 232356   9248    0    0   344     0 67003   181  0 49 51
 0  1  0      0   4584 232536   9272    0    0   760   404 66309   578  1 51 48
 0  1  0      0   4072 232972   9244    0    0   296     0 66946   157  0 51 49
 0  1  0      0   5352 231876   9252    0    0    64  2944 65474   491  6 59 35
[...]
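The "in" column of vmstat only gives the total interrupt rate. To attribute the rate to a specific IRQ line, two /proc/interrupts snapshots taken a known interval apart can be subtracted. The sketch below assumes the layout shown in this report (a CPU header row, then "label: count per CPU, description"); the helper names `parse_interrupts` and `irq_rates` are hypothetical and were not used by anyone in this thread.

```python
import time


def parse_interrupts(text):
    """Return {irq label: count summed over all CPUs} for one snapshot."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the controller/device description
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def irq_rates(before, after, seconds):
    """Interrupts per second for every label present in both snapshots."""
    a = parse_interrupts(before)
    b = parse_interrupts(after)
    return {irq: (b[irq] - a[irq]) / seconds for irq in b if irq in a}


if __name__ == "__main__":
    interval = 5.0  # seconds between snapshots; arbitrary choice
    with open("/proc/interrupts") as f:
        first = f.read()
    time.sleep(interval)
    with open("/proc/interrupts") as f:
        second = f.read()
    rates = irq_rates(first, second, interval)
    for irq, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
        print(f"{irq:>4}: {rate:10.1f} /s")
```

On a box showing this problem, the acpi line (IRQ 9 here) would be expected to dominate the output.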
er, s/not/now/
Created attachment 382 [details] disable busmaster event checking
We really should open a new bug for this, but can you try this patch first? Then we'll look at opening a new bug. This patch essentially disables bus master event monitoring in ACPI.
Thanks Zwane! Disabling ACPI fixes the problem (obviously) and so does your patch. Here is /proc/interrupts with your patch:

           CPU0       CPU1
  0:      86839      70698    IO-APIC-edge  timer
  1:          0         12    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  8:          0          1    IO-APIC-edge  rtc
  9:      50012      49988   IO-APIC-level  acpi
 12:          5         50    IO-APIC-edge  i8042
 14:          0          2    IO-APIC-edge  ide0
 15:          0          2    IO-APIC-edge  ide1
 17:         24         25   IO-APIC-level  aic7xxx
 19:       1830       1862   IO-APIC-level  3ware Storage Controller, eth0
NMI:          0          0
LOC:     157392     157391
ERR:          0
MIS:          0

Let me know if there is anything else I can do to diagnose the root problem. I wouldn't be too surprised if my motherboard has broken ACPI -- it was one of the first RDRAM+333MHz FSB motherboards.
Could you please open a new bug and add your comments starting from #17? I will close this bug with resolution set to INVALID, since it was a configuration error.