We've found a severe problem with the 3.2 and 3.3-rc1 kernels on certain Kirkwood machines.

Problem: (compressed uImage) Kernel will not boot on many Kirkwood devices (Dockstar, some PogoPlugs, others)

Package(s): Linux Kernel 3.2.x and 3.3.0-rc1

Steps to reproduce: Build a uImage kernel with any of the usual setups: natively on Debian Squeeze or Wheezy (with build-essential etc.), with the CodeSourcery cross-compile toolchain, with ArchLinux, or with any other kernel build setup.

Attempts at booting these 3.2 and 3.3 kernels show a complete dead hang on the serial output:

====================================
## Booting kernel from Legacy Image at 00800000 ...
   Image Name:   Linux-3.3.0-rc1-kirkwood
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    1626136 Bytes = 1.6 MiB
   Load Address: 00008000
   Entry Point:  00008000
   Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 01100000 ...
   Image Name:   initramfs-3.3.0-rc1-kirkwood
   Image Type:   ARM Linux RAMDisk Image (gzip compressed)
   Data Size:    5778790 Bytes = 5.5 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
====================================

The same behavior is seen whether we use "make uImage" or

  make-kpkg --rootcmd fakeroot --arch armel --append-to-version=-kirkwood --revision=1.0 --initrd kernel_image

Either way the result is a non-booting uImage.

Users on some forums have noted that switching between gzip and lzma compression sometimes changes the result, but not always. Behavior is unpredictable, it seems.

We have confirmed that an __uncompressed__ kernel will boot completely. This is not the default for Debian armel packages or other Kirkwood installations, though, and won't be suitable as a long-term fix.

Two things to note and clarify:

1. This is _not_ the arch/arm/asm/bug.h compile-time problem that is causing problems in ARM.
2. 3.1.10 works just fine on Kirkwood - no problems there.

Here are some links that show a few of the discussions that are going on regarding this:

• Re: Linux kernels 3.2 & 3.3-rc1 are broken!
  http://forum.doozan.com/read.php?2,6550,6868#msg-6868
• new kernel 3.2 does not boot on the dockstar
  http://archlinuxarm.org/forum/viewtopic.php?f=18&t=2314
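A note on the "(uncompressed)" in the boot log: that field comes from the uImage header, not from the kernel inside it. The "Uncompressing Linux..." line suggests the payload is a self-extracting zImage wrapped in a uImage whose header carries no compression flag. As a rough sketch for anyone comparing images, the following reads that header field directly. It assumes the 64-byte legacy U-Boot image header (magic 27 05 19 56 in the first four bytes, compression type in byte 31); the function name `uimage_comp` is made up for illustration.

```shell
#!/bin/sh
# Sketch: report the compression field of a legacy uImage header.
# Assumption: 64-byte legacy header, magic in bytes 0-3, compression
# type in byte 31 (0 none, 1 gzip, 2 bzip2, 3 lzma, 4 lzo).
# uimage_comp is a hypothetical helper name, not part of any tool.

uimage_comp() {
    # $1: path to a uImage file
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "27051956" ]; then
        echo "not-a-legacy-uimage"
        return 1
    fi
    # Read the single compression byte as an unsigned decimal number.
    comp=$(od -An -tu1 -j31 -N1 "$1" | tr -d ' \n')
    case "$comp" in
        0) echo none ;;
        1) echo gzip ;;
        2) echo bzip2 ;;
        3) echo lzma ;;
        4) echo lzo ;;
        *) echo "unknown($comp)" ;;
    esac
}
```

Usage would be something like `uimage_comp /boot/uImage` (the path is only an example). A result of "none" together with a kernel that prints "Uncompressing Linux..." means the gzip/lzma choice discussed above is made inside the zImage, not in the uImage wrapper.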
Cc-ing Raphael since this is a regression (and so should block bug 42566 to get on the list).

purdyd, can you bisect? It works somewhat like this (feel free to tweak for cross-compilation and for building on a different machine from the one the kernel runs on, as appropriate):

0. Prerequisites:

	apt-get install git build-essential

1. Grab the kernel, with history:

	git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
	cd linux

   Or, if you already have a git checkout of the kernel, update it:

	cd linux
	git fetch origin
	git checkout origin/master

2. Configure:

	cp /boot/config-$(uname -r) .config; # current configuration
	make localmodconfig; # minimal configuration
	make nconfig; # tweak configuration

3. Test a known-broken version:

	make deb-pkg; # optionally with -j<num> for parallel build
	dpkg -i ../<name of package>
	make sure kernel is flashed, reboot, test the uncompressed and compressed versions

   Hopefully it reproduces the problem. So:

4. Test a known-good version:

	git checkout v3.1
	make silentoldconfig; # reuse configuration
	make deb-pkg; # maybe with -j<num>
	dpkg -i ../<name of package>
	... test as usual ...

   Hopefully it works fine. So let git know the result:

	git bisect start
	git bisect bad origin/master
	git bisect good v3.1

5. A version halfway between is automatically checked out to test:

	make silentoldconfig
	make deb-pkg; # maybe with -j<num>
	dpkg -i ../<name of package>
	... test ...
	git bisect bad; # if it reproduces the problem
	git bisect good; # if compressed kernels work fine
	git bisect skip; # if some other problem makes it hard to test

6. Repeat step 5 until bored.

Eventually it will spit out the "first bad commit", or if you get bored before that, you can run "git bisect log" to get a summary of the tests you have run, which is almost as good. If the gitk package is installed, you can run "git bisect visualize" at any step to watch the regression range narrowing.
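For anyone who has not bisected before, here is a toy run of the same mechanics on a throwaway repository, so the good/bad bookkeeping can be tried without flashing anything. Everything in it is hypothetical: the ten dummy commits, the `regression` marker file standing in for "does not boot", and the `good-start` tag. It uses "git bisect run" to answer good/bad automatically, which only works on real hardware if the boot test itself can be scripted.

```shell
#!/bin/sh
# Toy demonstration of git bisect mechanics on a scratch repository.
# This is NOT the kernel tree; all names here are made up for the demo.
set -e

demo=$(mktemp -d)
cd "$demo"
git init -q .

# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Ten commits; commit 7 introduces the "regression" (a marker file).
i=1
while [ "$i" -le 10 ]; do
    echo "$i" > version
    if [ "$i" -eq 7 ]; then
        echo broken > regression
    fi
    git add -A
    git commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then
        git tag good-start
    fi
    i=$((i + 1))
done

# Mark the newest commit bad and the first commit good.
git bisect start HEAD good-start >/dev/null

# Exit status 0 means "good", 1 means "bad": the tree is bad
# whenever the marker file exists.
git bisect run sh -c '! test -e regression' >/dev/null

# After the run, refs/bisect/bad names the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad: $first_bad"

git bisect reset >/dev/null 2>&1
cd /
rm -rf "$demo"
```

With ten commits the run needs only three or four tests, which is why bisecting between v3.1 and origin/master stays tolerable even with a reflash per step.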
From Ian Campbell, at <http://bugs.debian.org/658759>:

> I suspect this is due to the lack of this u-boot patch:
> http://lists.denx.de/pipermail/u-boot/2012-February/117020.html
>
> I found that without this my 3.2 dreamplug kernel would not boot (with
> the 2011.12-2 package from debian). It's related to
> CONFIG_ARM_PATCH_PHYS_VIRT.

That would point to c1becedc8871 (ARM: enable ARM_PATCH_PHYS_VIRT by default, v3.2-rc1~189^2~1^6~2) as the first bad commit. One can check if this is the cause by disabling ARM_PATCH_PHYS_VIRT to see if that helps.

Is there anything the kernel could do to continue to work with old (well, current today ;-)) versions of u-boot, too?
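Before rebuilding, it may help to confirm what a given .config actually says about the option. A minimal sketch, assuming the usual .config conventions ("CONFIG_FOO=y" or "=m" for enabled, "# CONFIG_FOO is not set" for disabled); the helper name `config_state` is made up for illustration:

```shell
#!/bin/sh
# Sketch: report whether a Kconfig option is enabled in a .config file.
# config_state is a hypothetical helper, not part of the kernel tree.

config_state() {
    # $1: option name without the CONFIG_ prefix
    # $2: path to the .config file
    if grep -q "^CONFIG_$1=[ym]\$" "$2"; then
        echo enabled
    elif grep -q "^# CONFIG_$1 is not set\$" "$2"; then
        echo disabled
    else
        echo absent
    fi
}
```

For example, `config_state ARM_PATCH_PHYS_VIRT .config` in the build directory. To flip the option without going through a menu, the kernel tree's scripts/config helper can be used (`scripts/config --disable ARM_PATCH_PHYS_VIRT`), followed by a configuration refresh such as "make oldconfig" so that dependent options are re-resolved.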
From Nico Pitre, at [1]:

> You really do want to have uboot patched. Who knows what other latent
> issues are there that you don't know about.

I wonder if the kernel should read the extra features register at some appropriate moment and quietly disable L2 or panic with a hint that the bootloader has screwed up. This is very early in the boot sequence so it might be tricky.

Hints for the novice:

 - enabling/disabling L2: arch/arm/mm/cache-feroceon-l2.c
 - booting a compressed kernel: arch/arm/boot/compressed/head.S

[1] http://thread.gmane.org/gmane.linux.ports.arm.kernel/127951/focus=151172
I'm attaching my trouble here although I'm not sure whether it's the same thing.

Symptom: Dockstar does not boot since the upgrade from 3.0-longterm to 3.4-longterm. No message after "Uncompressing Linux... done, booting the kernel.", not even on the serial console.

Bisecting led to v3.0-rc6-6-g3835d69:

    3835d69a6c7048a28d0aea3cb8403d5e83a0f867 is the first bad commit
    commit 3835d69a6c7048a28d0aea3cb8403d5e83a0f867
    Author: Russell King <rmk+kernel@arm.linux.org.uk>
    Date:   Wed Jul 6 10:39:34 2011 +0100

        ARM: vmlinux.lds: move init sections between text and data sections

That is in contradiction to "3.1.10 works just fine on Kirkwood". Using an uncompressed image ("make Image") did not help either.

More shotgun debugging (on 3.6-rc4):

Disabling CONFIG_ARM_UNWIND: to no avail.

Disabling CONFIG_CACHE_FEROCEON_L2 led to

    (...)
      CC      init/version.o
      LD      init/built-in.o
    arch/arm/mach-kirkwood/built-in.o: In function `kirkwood_l2_init':
    cpuidle.c:(.init.text+0x1d4): undefined reference to `feroceon_l2_init'
    make: *** [vmlinux] Error 1

And I'd happily disable ARM_PATCH_PHYS_VIRT, at least as a test, but I have no idea what to enter for CONFIG_PHYS_OFFSET.

So, I'm out of ideas at the moment. Do you have some more?

The .config used to build at the guilty commit is attached.
Created attachment 79311
.config
Hi Christoph,

(In reply to comment #4)
> I'm attaching my trouble here although I'm not sure whether it's the
> same thing.

Yes, it isn't. Could you file a separate bug (or even better, write to linux-arm-kernel@ directly and file a bug with a link to a mailing list archive with your message)?

Thanks much,
Jonathan
(In reply to comment #6)
> (In reply to comment #4)
> > I'm attaching my trouble here although I'm not sure whether it's the
> > same thing.
>
> Yes, it isn't. Could you file a separate bug (or even better, write to
> linux-arm-kernel@ directly and file a bug with
[...]

Sorry to make things complicated. Simpler, if you prefer: a message to linux-arm-kernel@lists.infradead.org, cc-ing me (jrnieder@gmail.com).
*** Bug 47071 has been marked as a duplicate of this bug. ***