Bug 195729
Summary: | JBD2: Spotted dirty metadata buffer (dev = sda1, blocknr = 1766784). There's a risk of filesystem corruption in case of system crash. | ||
---|---|---|---|
Product: | File System | Reporter: | jqiaoulk |
Component: | ext4 | Assignee: | fs_ext4 (fs_ext4) |
Status: | RESOLVED INVALID | ||
Severity: | normal | ||
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.10.27 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
jqiaoulk
2017-05-12 13:31:46 UTC
It tells us that clk_set_parent() can sleep under some circumstances (in this case, to acquire the mutex lock). Therefore, the driver caller routine shown below must be running, incorrectly, inside a critical section that has called spin_lock() or a related function that disables preemption. Since this is not IRQ context, it is very likely that a spin lock is held in this failure case. Looking into the driver code should give the clue.

[Fri May 05 05:26:09.992 2017] [40636.613352] [<7eea79ec>] (vibe_os_clk_set_parent+0x34/0xb0 [vibe_os]) from [<7eec3cd0>] (_ZN12CSTmClockLLA6EnableE29stm_clk_divider_output_name_t31stm_clk_divider_output_source_t31stm_clk_divider_output_divide_t+0xd4/0xf0 [stmcore_display_stiH407])
[Fri May 05 05:26:10.017 2017] [40636.635408] [<7eec3cd0>] (_ZN12CSTmClockLLA6EnableE29stm_clk_divider_output_name_t31stm_clk_divider_output_source_t31stm_clk_divider_output_divide_t+0xd4/0xf0 [stmcore_display_stiH407]) from [<7eec8c2c>] (_ZN16CSTmMainTVOutput15SetOutputFormatEj+0xc8/0x1b0 [stmcore_display_stiH407])
[Fri May 05 05:26:10.036 2017] [40636.660506] [<7eec8c2c>] (_ZN16CSTmMainTVOutput15SetOutputFormatEj+0xc8/0x1b0 [stmcore_display_stiH407]) from [<7eebbd40>] (_ZN16CSTmMasterOutput5StartEPK18stm_display_mode_s+0x1b8/0x1f8 [stmcore_display_stiH407])
[Fri May 05 05:26:10.069 2017] [40636.679522] [<7eebbd40>] (_ZN16CSTmMasterOutput5StartEPK18stm_display_mode_s+0x1b8/0x1f8 [stmcore_display_stiH407]) from [<7eeba0ac>] (stm_display_output_start+0x6c/0xb4 [stmcore_display_stiH407])
[Fri May 05 05:26:10.069 2017] [40636.697123] [<7eeba0ac>] (stm_display_output_start+0x6c/0xb4 [stmcore_display_stiH407]) from [<7f444650>] (dil_blender_output_start+0x8c/0x17c [stm_cdi_dvb])
[Fri May 05 05:26:10.079 2017] [40636.711321] [<7f444650>] (dil_blender_output_start+0x8c/0x17c [stm_cdi_dvb]) from [<7f45717c>] (cdi_hdmi_exit+0xb64/0xbc4 [stm_cdi_dvb])
[Fri May 05 05:26:10.092 2017] [40636.723660] [<7f45717c>] (cdi_hdmi_exit+0xb64/0xbc4 [stm_cdi_dvb]) from [<7f4578a4>] (dil_hdmi_set_video_code+0x4c/0x1d4 [stm_cdi_dvb])
[Fri May 05 05:26:10.104 2017] [40636.735912] [<7f4578a4>] (dil_hdmi_set_video_code+0x4c/0x1d4 [stm_cdi_dvb]) from [<7f44ed74>] (hdmi_release+0x316c/0x781c [stm_cdi_dvb])
[Fri May 05 05:26:10.151 2017] [40636.748251] [<7f44ed74>] (hdmi_release+0x316c/0x781c [stm_cdi_dvb]) from [<7f4539a0>] (hdmi_ioctl+0x57c/0x2d40 [stm_cdi_dvb])
[Fri May 05 05:26:10.152 2017] [40636.759631] [<7f4539a0>] (hdmi_ioctl+0x57c/0x2d40 [stm_cdi_dvb]) from [<800fd4dc>] (vfs_ioctl+0x28/0x3c)
[Fri May 05 05:26:10.152 2017] [40636.769137] [<800fd4dc>] (vfs_ioctl+0x28/0x3c) from [<800fdf48>] (do_vfs_ioctl+0x4b0/0x560)
[Fri May 05 05:26:10.152 2017] [40636.777484] [<800fdf48>] (do_vfs_ioctl+0x4b0/0x560) from [<800fe02c>] (SyS_ioctl+0x34/0x60)
[Fri May 05 05:26:10.152 2017] [40636.785836] [<800fe02c>] (SyS_ioctl+0x34/0x60) from [<8000dbc0>] (ret_fast_syscall+0x0/0x30)

So, briefly, one of the following calls must be incorrectly calling spin_lock() or another function that disables preemption in this failure case. For example:

Failure case:

stm_display_output_start()
    spin_lock()
    _ZN16CSTmMasterOutput5StartEPK18stm_display_mode_s -------> We can't sleep here since it is in the critical section !!!
    spin_unlock()

Call Sequence:
(1) vibe_os_clk_set_parent
(2) _ZN12CSTmClockLLA6EnableE29stm_clk_divider_output_name_t31stm_clk_divider_output_source_t31stm_clk_divider_output_divide_t
(3) _ZN16CSTmMainTVOutput15SetOutputFormatEj
(4) _ZN16CSTmMasterOutput5StartEPK18stm_display_mode_s
(5) stm_display_output_start()
(6) dil_blender_output_start
(7) cdi_hdmi_exit
(8) dil_hdmi_set_video_code
(9) hdmi_release
(10) hdmi_ioctl()

This is the root cause, which is pretty straightforward. The spin lock is first acquired in the following function:

OutputResults CSTmMasterOutput::Start(const stm_display_mode_t* pModeLine)
{
    vibe_os_lock_resource(m_lock);
    if(!this->SetOutputFormat(m_ulOutputFormat))
    {
        TRC;
        goto stop_and_exit;
    }
    vibe_os_unlock_resource(m_lock);
}

and this->SetOutputFormat will eventually call vibe_os_clk_set_parent() as below:

bool CSTmClockLLA::Enable
{
    vibe_os_clk_set_parent(output, source);
}

Based on the call stack, we know that vibe_os_clk_set_parent will call clk_set_parent(). clk_set_parent() is a basic kernel API that can sleep, as the call stack in the log shows. Therefore, CSTmMasterOutput::Start can sleep while holding the spin lock. ST needs to review the code and decide either where best to place the spin lock or whether to change it to a mutex.

Oh... sorry, this comment was posted to the wrong place. I will close this ticket and create a new one... |
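
For illustration only, the root cause described above (a call that may sleep, made while a spin lock is held) can be sketched as a small userspace model. Nothing here is kernel or ST driver code: the `model` namespace, `SpinSection`, `might_sleep_check()`, `clk_set_parent_model()` and the three `start_*` functions are hypothetical names, and it is an assumption that vibe_os_lock_resource() maps to a spin lock. The sketch only shows why the buggy shape trips a "sleeping in atomic context" style check, and two possible ways the driver could be restructured.

```cpp
#include <cassert>
#include <mutex>

// Userspace model only: none of these names are kernel or ST driver APIs.
namespace model {

thread_local int atomic_depth = 0;      // > 0 models "preemption disabled"
thread_local int sleep_violations = 0;  // sleep attempts seen in atomic context

// Models a might_sleep()-style debug check: record a violation if a
// potentially sleeping call is reached while in atomic context.
inline void might_sleep_check() {
    if (atomic_depth > 0) {
        ++sleep_violations;
    }
}

// Models a spin-lock-protected critical section (assumed to be what
// vibe_os_lock_resource()/vibe_os_unlock_resource() provide).
class SpinSection {
public:
    SpinSection() { ++atomic_depth; }
    ~SpinSection() { --atomic_depth; }
    SpinSection(const SpinSection&) = delete;
    SpinSection& operator=(const SpinSection&) = delete;
};

// Stands in for the mutex that, per the report, clk_set_parent() acquires.
std::mutex clk_mutex;

// Models a call that may sleep because it takes a mutex.
inline bool clk_set_parent_model() {
    might_sleep_check();
    std::lock_guard<std::mutex> guard(clk_mutex);
    return true;
}

// Buggy shape: the sleeping call sits inside the spin-lock section.
inline bool start_buggy() {
    SpinSection lock;
    return clk_set_parent_model();
}

// Possible fix 1: protect the output state with a sleeping lock instead.
std::mutex output_mutex;
inline bool start_with_mutex() {
    std::lock_guard<std::mutex> guard(output_mutex);
    return clk_set_parent_model();
}

// Possible fix 2: do the clock reparenting before entering the
// spin-lock section, which then covers only non-sleeping work.
inline bool start_reordered() {
    bool ok = clk_set_parent_model();
    SpinSection lock;
    // ... non-sleeping state updates would go here ...
    return ok;
}

}  // namespace model
```

Calling `model::start_buggy()` increments `model::sleep_violations`, while `model::start_with_mutex()` and `model::start_reordered()` leave it unchanged. Which of the two shapes suits the real driver depends on what m_lock actually protects, which the report leaves for ST to decide.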