Operating System: Both Debian GNU/Linux and Gentoo Linux
Latest working: Any kernel compiled for i386 (x86) works fine, yet 2.6.25 and 2.6.26 fail miserably when built for x86_64.

More information at http://bugs.gentoo.org/show_bug.cgi?id=232857

The board is an Intel D945GCLF with an Atom 230, 1 x 2GB DDR2 DIMM, 2 x 100GB SATA.

I will attach the kernel configuration for the working 2.6.26 kernel (x86) and the configuration for our amd64 kernel, which fails to acquire a DHCP lease, plus the boot output of the box on both kernels.

Our environment is pretty simple: network boot a box via PXE, send the kernel via TFTP, then mount a "rescue environment" (a simple nfsroot) as the root filesystem. From there we do recovery work and operating system installations.

Happy to take any suggestions or patches and test them out.
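Since the failure happens while the kernel itself is acquiring a DHCP lease for the nfsroot, it can help to rule out configuration differences between the x86 and amd64 builds first. The following is only a sketch: the function name check_netboot_config is hypothetical, and it assumes there is no initramfs, so that IP autoconfiguration, the NFS root and the r8169 driver all have to be built in (=y) rather than built as modules.

```shell
# Hypothetical helper: report whether the options a DHCP + NFS-root boot
# relies on are built into the kernel described by the given .config file.
# Assumption: no initramfs, so "=m" is treated the same as missing.
check_netboot_config() {
    config_file="$1"
    missing=0
    for opt in CONFIG_IP_PNP CONFIG_IP_PNP_DHCP CONFIG_NFS_FS CONFIG_ROOT_NFS CONFIG_R8169; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "ok       ${opt}"
        else
            echo "MISSING  ${opt}"
            missing=1
        fi
    done
    # Exit status 0 only if every option is built in.
    return "$missing"
}
```

Running check_netboot_config against each of the two attached configurations would show at a glance whether the amd64 build dropped one of these options.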
Created attachment 16967 [details] kernel configuration, 2.6.26 (x86_64)
Created attachment 16968 [details] boot output, 2.6.26 (x86_64)
Created attachment 16969 [details] kernel configuration, 2.6.26 (x86)
Created attachment 16970 [details] boot output, 2.6.26 (x86_64)
Created attachment 16971 [details] hardware configuration taken from `lshw` on a working 32-bit installation of Debian Etch
2.6.26-git contains some 810{1/2} related changes. Can you give its r8169 driver a try? -- Ueimor
Created attachment 16984 [details] boot output, 2.6.26-git (x86_64)

To confirm, I've pulled the latest with:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-2.6

It appears to get past the r8169 issue and *does* grab a DHCP lease, which is significant progress, but then we get a nasty trace involving SMP. Log attached.

I'm going to try taking 2.6.26-release and merging only your r8169 patches, rather than using the whole linux-2.6.git, to see if that works.
Same issue with just the r8169.c merged into 2.6.26-release, but the trace is a tiny bit different. Still pretty fatal though :)

[ 11.172641] VFS: Mounted root (nfs filesystem) readonly.
[ 11.178161] Freeing unused kernel memory: 320k freed
[ 11.187383] Write protecting the kernel read-only data: 7108k
[ 11.192652] BUG: unable to handle kernel paging request at ffffffff8101f239
[ 11.192652] IP: [<ffffffff81019f43>] smp_call_function+0x1/0x1a
[ 11.192652] PGD 1003067 PUD 1007063 PMD 7f239163 PTE 101f161
[ 11.192652] Oops: 0003 [1] SMP
[ 11.192652] CPU 0
[ 11.192652] Modules linked in:
[ 11.192652] Pid: 1, comm: swapper Not tainted 2.6.26-amd64 #1
[ 11.192652] RIP: 0010:[<ffffffff81019f43>] [<ffffffff81019f43>] smp_call_function+0x1/0x1a
[ 11.192652] RSP: 0000:ffff81007f07fde0 EFLAGS: 00010246
[ 11.192652] RAX: ffffffff81009000 RBX: 0000000000000000 RCX: 0000000000000001
[ 11.192652] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffff8101f239
[ 11.192652] RBP: ffffffff81009000 R08: ffff810000000000 R09: 00003ffffffff000
[ 11.192652] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 11.192652] R13: ffffffff8101f239 R14: 0000000000000000 R15: 00000000000006f1
[ 11.192652] FS: 0000000000000000(0000) GS:ffffffff8177e000(0000) knlGS:0000000000000000
[ 11.192652] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 11.192652] CR2: ffffffff8101f239 CR3: 0000000001001000 CR4: 00000000000006e0
[ 11.192652] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 11.192652] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 11.192652] Process swapper (pid: 1, threadinfo ffff81007f07e000, task ffff81007f07d750)
[ 11.192652] Stack: ffffffff81009000 ffffffff810345e7 0000000000000000 0000000000000000
[ 11.192652]  0000000000000000 ffffffff8101fd9e ffffffff816fa000 0000000000000000
[ 11.192652]  0000000000000002 0000000100000001 00000000000016f9 0000000000000246
[ 11.192652] Call Trace:
[ 11.192652] [<ffffffff81009000>] ? run_init_process+0x0/0x1a
[ 11.192652] [<ffffffff810345e7>] ? on_each_cpu+0x10/0x22
[ 11.192652] [<ffffffff8101fd9e>] ? change_page_attr_set_clr+0x134/0x1ac
[ 11.192652] [<ffffffff81009000>] ? run_init_process+0x0/0x1a
[ 11.192652] [<ffffffff8101db64>] ? mark_rodata_ro+0x4a/0x6c
[ 11.192652] [<ffffffff817f0056>] ? ip_auto_config+0x0/0xde6
[ 11.192652] [<ffffffff8100902d>] ? init_post+0x13/0xf2
[ 11.192652] [<ffffffff817cd832>] ? kernel_init+0x281/0x292
[ 11.192652] [<ffffffff8100bdf8>] ? child_rip+0xa/0x12
[ 11.192652] [<ffffffff817cd5b1>] ? kernel_init+0x0/0x292
[ 11.192652] [<ffffffff8100bdee>] ? child_rip+0x0/0x12
[ 11.192652]
[ 11.192652]
[ 11.192652] Code: 74 1d 48 8b 15 5f dd 78 00 48 63 c7 4c 8b 1d c5 3f 6e 00 48 8b 3c c2 4c 89 ca 41 58 41 ff e3 fa 48 89 d7 ff d6 fb 5a 31 c0 c3 48 <89> f8 48 89 f2 48 8b 3d e9 de 78 00 48 89 c6 4c 8b 1d 97 3f 6e
[ 11.192652] RIP [<ffffffff81019f43>] smp_call_function+0x1/0x1a
[ 11.192652] RSP <ffff81007f07fde0>
[ 11.192652] CR2: ffffffff8101f239
[ 11.192652] ---[ end trace b53d9db9c1b0ffdc ]---
[ 11.192652] Kernel panic - not syncing: Attempted to kill init!
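As a reading aid for the "Oops: 0003" line in the trace, the hexadecimal value is the x86 page-fault error code, and its low bits can be decoded mechanically. The sketch below is not from this thread: the function name decode_pf_error is hypothetical, it takes the hex digits without a 0x prefix, and it covers only the three lowest bits (present, write, user).

```shell
# Hypothetical helper: decode the low three bits of an x86 page-fault
# error code as printed on the "Oops:" line (hex digits, no 0x prefix).
decode_pf_error() {
    code=$((0x$1))
    # Bit 0: set means the page was present (protection violation).
    if [ $((code & 1)) -ne 0 ]; then
        echo "protection violation (page was present)"
    else
        echo "page not present"
    fi
    # Bit 1: set means the faulting access was a write.
    if [ $((code & 2)) -ne 0 ]; then
        echo "write access"
    else
        echo "read access"
    fi
    # Bit 2: set means the fault happened in user mode.
    if [ $((code & 4)) -ne 0 ]; then
        echo "user mode"
    else
        echo "kernel mode"
    fi
}
```

Calling decode_pf_error 0003 reports a kernel-mode write to a page that was present. That would be consistent with a write landing on memory that had just been write-protected ("Write protecting the kernel read-only data" appears immediately before the BUG line), though this is an interpretation and not a confirmed diagnosis.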
Any tips on debug options I could enable, or extra statements to insert into the code to try and debug this?
It could be interesting to check this thread: http://thread.gmane.org/gmane.linux.kernel/718708 Otherwise the boot log shows that your 810x chipset is not completely identified. I'd suggest applying #0001 to #0006 from http://userweb.kernel.org/~romieu/r8169/2.6.27-rc2/20080808 on top of 2.6.27-rc2. -- Ueimor
Realtek has released a driver for NICs based on the RTL8100E/RTL8101E/RTL8102E-GR chips. The driver is available at http://www.realtek.com.tw/Downloads/downloadsView.aspx?Langid=1&PNid=14&PFid=7&Level=5&Conn=4&DownTypeID=3&GetDown=false

Driver v.1.009 compiles cleanly against vanilla kernel 2.6.26 and works well on my Gentoo box. I hope it helps.
Hi Calori :) The problem is fairly specific: the system fails to boot on x86_64 *only* (it works great on x86). The kernel cannot acquire a DHCP lease after it has been sent over via TFTP but before userland has started, so the NFS root cannot be mounted and the network boot cannot proceed!

Francois -- thanks for the new patches. I'll give those a shot today against 2.6.27-rc2 and report back on whether they resolve the issue. Should be in 3-4 hours.
2.6.27-rc4 with a 'make mrproper && make defconfig' followed by a fresh configuration of the kernel seems to have solved the issue :) Woot. Thanks for your time Francois.
I will wait a few days before closing the bug in order to be sure that it is reliably fixed. -- Ueimor
Just did a build of 2.6.27-rc5: works fine! http://kerneltrap.org/mailarchive/linux-kernel/2008/8/30/3139664/thread That looks "interesting" but I'm not sure if it's related.