Created attachment 144001 [details]
trace photo

Third time in two days that my kernel has panicked while issuing the following command as an ordinary user:

dd if=/dev/urandom of=myfile bs=100M count=1

Every time the trace looks more or less like the one attached. Sorry if this is not cpuidle related; it's just a guess on my side based on the trace.
Created attachment 144011 [details]
better picture

Apparently this is very reproducible. In fact it's just crashed again...
Could you please retry with a kernel built from the linux-next repo? There are recent changes regarding /dev/*random:

git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
I've done a quick test with 5eb00b037d9bb650b18b8f331bb9fb7a66559b5f (quickly rebuilt with "make oldconfig" and all defaults accepted) and it does not panic. However, it also looks terribly broken in that /dev/urandom returns EOF after exactly 33554431 bytes (tried 10 times; it happened every single time):

$ dd if=/dev/urandom of=myfile bs=100M count=1
0+1 records in
0+1 records out
33554431 bytes (34 MB) copied, 2.0128 s, 16.7 MB/s

On a side note, kernel 3.14.0 does not panic when dd reads from urandom.
33MB EOF reproduced with 15ba2236f3556fc01b9ca91394465152b5ea74b6 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
I believe the reason is commit 79a8468747c5f95ed3d5ce8376a3e82e0c5857fc, specifically this chunk:

@@ -1376,6 +1386,7 @@ urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 			    "with %d bits of entropy available\n",
 			    current->comm, nonblocking_pool.entropy_total);
 
+	nbytes = min_t(size_t, nbytes, INT_MAX >> (ENTROPY_SHIFT + 3));
 	ret = extract_entropy_user(&nonblocking_pool, buf, nbytes);
 
 	trace_urandom_read(8 * nbytes, ENTROPY_BITS(&nonblocking_pool),

which is likely what the commit message describes as an "additional paranoia check to prevent overly large count values to be passed into urandom_read()".
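Incidentally, that cap explains the exact 33554431-byte cutoff seen above. Assuming ENTROPY_SHIFT is 3 (entropy being accounted in 1/8th-bit units since the fractional-bit change), a byte count is converted to fractional bits by a left shift of ENTROPY_SHIFT + 3 = 6, so the largest request whose converted value still fits in a signed int is INT_MAX >> 6 bytes. A throwaway userspace check of the arithmetic (my sketch, not kernel code):

/*
 * Hypothetical userspace check, not kernel code; assumes
 * ENTROPY_SHIFT == 3 as in drivers/char/random.c of this era,
 * i.e. bytes -> fractional bits is a left shift by 6.
 */
#include <limits.h>
#include <stdio.h>

#define ENTROPY_SHIFT 3

int main(void)
{
	/* Largest byte count whose fractional-bit count fits in an int. */
	printf("cap = %d bytes\n", INT_MAX >> (ENTROPY_SHIFT + 3));

	/* The 100M request from the original dd command, in fractional bits. */
	printf("100M -> %lld fractional bits (INT_MAX = %d)\n",
	       (100LL << 20) << (ENTROPY_SHIFT + 3), INT_MAX);
	return 0;
}

This prints cap = 33554431, which is exactly where dd stopped, and shows the 100M request overflowing a signed 32-bit count.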
Emailed LKML (cc'ing Theodore and Hannes Frederic); see the thread with subject "Reading large amounts from /dev/urandom broken".
The bug should be fixed if you cherry-pick commit 79a8468747c5f95ed3d5ce8376a3e82e0c5857fc into 3.15.4+. Please let me know if it doesn't.

Why on *earth* are you using /dev/urandom in this way?

The nbytes restriction is to prevent an accounting failure since we are now tracking entropy in fractional bits, so when we convert bytes requested into fractional bits, we overflow if someone tries to request more than 32MB. Given that no sane use of /dev/urandom needs more than 256 bytes, this was considered acceptable.
(In reply to Theodore Tso from comment #7)
> The bug should be fixed if you cherry-pick commit
> 79a8468747c5f95ed3d5ce8376a3e82e0c5857fc into 3.15.4+. Please let me know
> if it doesn't.

I confirm that applying 79a8468747c5f95ed3d5ce8376a3e82e0c5857fc onto v3.15.4 indeed fixes the oops. Thanks for the quick reply.

> Why on *earth* are you using /dev/urandom in this way?

Because I love doing stupid things. Because I can do stupid things. Neither of which should be a valid reason to get a kernel panic, which is why I opened this bug. Applying 79a8468747c5f95ed3d5ce8376a3e82e0c5857fc forbids me from doing stupid things, which is how it fixes this bug (apologies for calling it broken behaviour; I see now it's intended).

> The nbytes
> restriction is to prevent an accounting failure since we are now tracking
> entropy in fractional bits, so when we convert bytes requested into
> fractional bits, we overflow if someone tries to request more than 32MB.

Thanks for the explanation.

> Given that no sane use of /dev/urandom needs more than 256 bytes, this was
> considered acceptable.

It certainly is, as long as it's properly enforced. Thanks again!
Well, read(2) is always allowed to return a short read. So the change which is currently queued for 3.16 doesn't actually break anything. The fact that dd treats a short read as an EOF is dd's problem. Any C program that reads from /dev/urandom must always check for short reads, since if a signal interrupts the read you can get a short read. The only difference is that now, if you request a length > 32MB, you are guaranteed to get a short read.
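For reference, here is a minimal sketch of the pattern Ted describes; it is my illustration, not code from dd or the kernel, and the helper name read_full is made up:

/*
 * Minimal sketch (illustration only, not from dd or the kernel) of
 * short-read handling: loop until the requested length has arrived,
 * retrying on EINTR and treating a zero return as an error.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Read exactly len bytes from fd; 0 on success, -1 on error (errno set). */
static int read_full(int fd, unsigned char *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = read(fd, buf + done, len - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal: retry */
			return -1;		/* real I/O error */
		}
		if (n == 0) {
			errno = EIO;		/* unexpected EOF */
			return -1;
		}
		done += (size_t)n;		/* short read: ask for the rest */
	}
	return 0;
}

int main(void)
{
	unsigned char buf[64];
	int fd = open("/dev/urandom", O_RDONLY);

	if (fd < 0 || read_full(fd, buf, sizeof(buf)) < 0) {
		perror("urandom");
		return EXIT_FAILURE;
	}
	close(fd);
	printf("got %zu random bytes\n", sizeof(buf));
	return 0;
}

With a loop like this, the new 32MB cap is invisible to the caller: an oversized request simply takes a few iterations instead of one.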
If dd doesn't check for short reads... :-)
Short reads are absolutely fine; it was me incorrectly assuming an EOF when in fact dd was failing to handle them properly. In fact I've just found out that dd requires an extra "iflag=fullblock" option in order to do so. Thanks again.
"Don't break userspace" doctrine says that short reads aren't fine.
If you do something like "dd if=/dev/urandom of=/dev/sdc bs=64M", it doesn't stop after 32MB: without a count= limit, dd keeps issuing reads until a real EOF, and each short read merely becomes a partial record. The fact that we are now returning short reads means that certain implementation-level behaviours will be *different*, but *different* is not the same as *breakage*.
(In reply to Theodore Tso from comment #9)
> Well, read(2) is always allowed to return a short read. So the change which
> is currently queued for 3.16 doesn't actually break anything.

I'm able to crash my desktop system (32-bit x86 Gentoo stable with vanilla kernel 3.15.6) with this command:

$> dd if=/dev/urandom of=/dev/zero bs=67108707

as a local, unprivileged user. Does the mentioned change prevent this crash?
Toralf, make sure you pulled 79a8468747c5f95ed3d5ce8376a3e82e0c5857fc ("random: check for increase of entropy_count because of signed conversion"); it fixes the crash.
There are two separate things being discussed in this bugzilla.

The first is the kernel BUG, which is fixed by 79a8468747c5; it has been accepted by Linus and cc'ed to stable@kernel.org, so it should eventually show up in a 3.15.x kernel.

The other is a complaint that commit 79a8468747c5 causes reads larger than 32MB to return only 32MB from the read(2) system call. That is, it results in a short read. POSIX always allows for a short read(2), and any program MUST check for short reads. The problem with dd is that POSIX requires the count=X parameter to be based on reads, not on bytes; this can be changed with iflag=fullblock.

There is no legitimate use of /dev/urandom with dd, so I'm not particularly worried about small implementation-level behavioural changes, especially when all of this is POSIX compliant, and any sane program (a) wouldn't be reading more than 512 bytes (usually only 16 or 32 bytes is plenty), and (b) would check the return value of read(2) and deal with short reads.

If you want to erase a disk, using nwipe is much faster. If you want to securely overwrite a file, the right tool to use is wipe.
(In reply to Alexey Dobriyan from comment #15)
> Toralf, make sure you pulled 79a8468747c5f95ed3d5ce8376a3e82e0c5857fc
> ("random: check for increase of entropy_count because of signed conversion");
> it fixes the crash.

thx - applied on top of 3.15.6 - works fine
*** Bug 80991 has been marked as a duplicate of this bug. ***