I have SATA SSD drive connected via TUSB9261 converter. Eventualy i experience filesystem inconsistency when i repeatedly shutdown the device like this: echo s > /proc/sysrq-trigger; echo u > /proc/sysrq-trigger; echo o > /proc/sysrq-trigger; Same problem does not occur when i do this: echo s > /proc/sysrq-trigger; echo u > /proc/sysrq-trigger; for i in /sys/block/*/device/delete; do echo 1 > "$i"; done; echo o > /proc/sysrq-trigger; This tells me, sysrq power(o)ff does not shutdown the devices in correct order. It shuts down the USB hub before it sends SATA shutdown to the SSD, therefore failing to notify SSD to write its internal caches before PMIC shuts down the power completely.
On Tue, Oct 08, 2024 at 11:43:02AM +0000, bugzilla-daemon@kernel.org wrote: > I have SATA SSD drive connected via TUSB9261 converter. Eventualy i > experience > filesystem inconsistency when i repeatedly shutdown the device like this: > > echo s > /proc/sysrq-trigger; > echo u > /proc/sysrq-trigger; > echo o > /proc/sysrq-trigger; Don't do that, you will end up with an unstable system. Please always shutdown like normal. > Same problem does not occur when i do this: > > echo s > /proc/sysrq-trigger; > echo u > /proc/sysrq-trigger; > for i in /sys/block/*/device/delete; do echo 1 > "$i"; done; > echo o > /proc/sysrq-trigger; Yes, that is because you are shutting down things in the proper order, like a normal shutdown will do. sysrq is not for a normal shutdown, again, do not do that and expect to have data saved properly. good luck! greg k-h
> Don't do that, you will end up with an unstable system. It's an emergency shutdown in embedded scenario, we have capacitors that are able to sustain the system for 1 second in case of unexpected power loss. It's plenty of time to shutdown filesystems and drives properly. But we simply cannot afford to wait for userspace. Works nicely so far. > Yes, that is because you are shutting down things in the proper order, like a normal shutdown will do. Is there a reason for sysrq shutdown not to do that? I mean... c'mon... shutting down SSDs cleanly is the bare minimum we should do... if we fail to do that, we can as well just unplug the power without trying to unmount or sync anything...
On Tue, Oct 08, 2024 at 12:00:33PM +0000, bugzilla-daemon@kernel.org wrote: > It's an emergency shutdown in embedded scenario, we have capacitors that are > able to sustain the system for 1 second in case of unexpected power loss. > It's > plenty of time to shutdown filesystems and drives properly. But we simply > cannot afford to wait for userspace. Works nicely so far. "so far" seems to mean "data loss" so I doubt that :) > > Yes, that is because you are shutting down things in the proper order, > like a normal shutdown will do. > > Is there a reason for sysrq shutdown not to do that? I mean... c'mon... > shutting down SSDs cleanly is the bare minimum we should do... if we fail to > do > that, we can as well just unplug the power without trying to unmount or sync > anything... Because the sysrq code does NOT go through the normal device shutdown process, you are circumventing it, because it is only for emergencies. Don't rely on it for a normal shutdown, as that is not what it is designed for at all. good luck! greg k-h
Anyway... My coworker says he triggered the same error by proper shutdown. He gets sata command fail somewhere near the end of the log when shutting down.
On 08.10.24 14:22, bugzilla-daemon@kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=219362 > > --- Comment #4 from Tomas Mudrunka (harviecz@gmail.com) --- > Anyway... My coworker says he triggered the same error by proper shutdown. He > gets sata command fail somewhere near the end of the log when shutting down. > Then please provide full dmesg. We need to know what is happening. You should definitely get a cache flush command during the shotdown. Regards Olver
The 'u' sysrq code attempts to remount all mounted filesystems read-only. Shouldn't that send a synchronize-cache command to the SSD? If it doesn't, there's something wrong. It's also possible that Tomas is sending the 'o' poweroff sysrq code too soon after the 'u' code, leaving the system insufficient time to flush the cache before it powers off. Tomas, the 'o' code does not shut down devices at all, let alone give them a chance to clean up first. As Greg said, it is meant for emergency use. All it does is turn off the system power -- it does not perform a clean shutdown.
what is the point of having 'u' and 's' keys in sysrq to sync filesystems, when we're not able to sync controler of underlying harddrive?
And it's not just about "sync" of the harddrive. It's about stopping it wear leveling operations. Modern storages DO shuffle data between blocks even when you don't write anything to them, power loss will cause data loss even on read-only systems. (users of raspberry pi with read-only SD cards can tell you lots of stories on this topic). Therefore we need to signal the device in advance we're going to remove its power.
The sysrq commands were originally implemented before drive controllers for PCs contained any significant state. Maybe they need some updating. More generally, these questions should be posed to the SCSI maintainers, because it is the SCSI subsystem which is responsible for sending the commands you're talking about. The USB driver does nothing more than relay the commands and their results between the SCSI subsystem and the storage device.
Would it be appropriate solution to add new sysrq command to call device_shutdown(). defined in drivers/base/core.c or perhaps call it directly in 'o' and 'b' sysrq handlers?
On Wed, Oct 09, 2024 at 10:08:41AM +0000, bugzilla-daemon@kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=219362 > > --- Comment #10 from Tomas Mudrunka (harviecz@gmail.com) --- > Would it be appropriate solution to add new sysrq command to call > device_shutdown(). > defined in drivers/base/core.c That is the normal shutdown path (i.e. the syscall), please use that instead if you wish to have devices shutdown correctly. sysrq sync and shutdown is for extraordinary situations where the system is broken and you are attempting to shut it down in a way that is better than just pulling the power at that point in time. Don't attempt to turn it into something else. thanks, greg k-h
But i want to be 100% sure no userspace will be involved in the shutdown. If i do proper shutdown the init will be doing stuff and i cannot achieve the shutdown on time before the backup capacitors discharge...
For example manapage for "poweroff -f" says following: -f, --force Force immediate power-off, halt, or reboot. If specified, the command does not contact the init system. In most cases, filesystems are not properly unmounted before shutdown. For example, the command reboot -f is mostly equivalent to systemctl reboot -ff, instead of systemctl reboot -f. when compared to sysrq... sysrq can at least unmount the filesystems...
Also weird thing is that when i do following: echo s > /proc/sysrq-trigger echo u > /proc/sysrq-trigger echo o > /proc/sysrq-trigger i see message printed by watchdog driver .shutdown callback. which means devices are shut down after all... unless there is some special case that only shuts down watchdog drivers during sysrq poweroff...
Ok, i've just went through elixir and found out that echo o > /proc/sysrq-trigger actualy leads to kernel_power_off() which is function described as "Shutdown everything and perform a clean system power_off." and which then calls kernel_shutdown_prepare() which calls device_shutdown() weirdly enough this is only true for 'o' and not for 'b', because 'o' handler is not pre-registered in sysrq.c (unlike 'b') and is additionaly registered by pm_sysrq_init() in kernel/power/poweroff.c. which seems bit inconsistent at least, but i am not aware of reasons behind this design.
Now you're discussing issues involving power management. These are not appropriate for either the USB or the SCSI people; you should contact the linux-pm@vger.kernel.org mailing list about these questions.
These operations are asynchronous and they don't even seem to be synchronized against each other. This is what I got from s followed u: [ 9oct 19:21] sysrq: Emergency Sync [ +0,000028] sysrq: Emergency Remount R/O [ +0,019142] EXT4-fs (sda2): re-mounted ef717e3d-539d-4988-9a02-ca541d643379 ro. Quota mode: disabled. [ +0,002201] Emergency Sync complete --- some long error message here Emergency Remount complete If subsequent o would execute before the u is done, the u would only increase risk of data loss by making the disk busy right before powerdown and by exercising the remount code against a concurrent loss of power, which might be a neglected case due to its rarity. You will never know whom to blame without hours of debugging. Myself, I would try sync ; poweroff -f
But sync followed by poweroff -f does not unmount the filesystems...