Bug 203247 - jbd2 gets stuck at jbd2_journal_commit_transaction frequently
Summary: jbd2 gets stuck at jbd2_journal_commit_transaction frequently
Status: RESOLVED INVALID
Alias: None
Product: File System
Classification: Unclassified
Component: ext3
Hardware: All
OS: Linux
Importance: P1 high
Assignee: fs_ext3@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-10 16:15 UTC by pkb07815
Modified: 2019-04-15 21:09 UTC
CC List: 1 user

See Also:
Kernel Version: 4.9.86
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg file (58.70 KB, text/plain)
2019-04-10 16:15 UTC, pkb07815
kern log (675.45 KB, text/plain)
2019-04-10 16:25 UTC, pkb07815
kern log (3.91 MB, text/plain)
2019-04-10 16:26 UTC, pkb07815
kern log (3.91 MB, text/plain)
2019-04-10 16:27 UTC, pkb07815
iostat (12.31 KB, text/plain)
2019-04-10 18:06 UTC, pkb07815

Description pkb07815 2019-04-10 16:15:14 UTC
Created attachment 282263 [details]
dmesg file

I am noticing jbd2 getting blocked in commit_transaction, causing processes to go into D state for a good amount of time; it comes out of that state automatically. I am dumping the sysrq-w and sysrq-l output below.
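
For reference, a minimal sketch of how such dumps can be triggered from a shell, assuming CONFIG_MAGIC_SYSRQ is enabled and the sysrq sysctl allows it (not necessarily the exact procedure used here):

# enable all SysRq functions for this boot (assumption: the default policy may be stricter)
sysctl -w kernel.sysrq=1
# 'w' dumps blocked (D-state) tasks, 'l' dumps backtraces of all active CPUs
echo w > /proc/sysrq-trigger
echo l > /proc/sysrq-trigger
# the output lands in the kernel log
dmesg | tail -n 200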

Apr 10 15:47:23 ex-etp3 kernel:sysrq: SysRq : Show Blocked State
Apr 10 15:47:23 ex-etp3 kernel:  task                        PC stack   pid father
Apr 10 15:47:23 ex-etp3 kernel:jbd2/sda10-8    D    0  2573      2 0x00000000
Apr 10 15:47:23 ex-etp3 kernel: ffff8818347c97c0 0000000000000000 ffff88101c9d0140 ffff8817ff898080
Apr 10 15:47:23 ex-etp3 kernel: ffff88101fd17e40 ffff8817fec47b38 ffffffff816bcf06 ffff8807ff66b040
Apr 10 15:47:23 ex-etp3 kernel: ffff8817ff898080 7fffffffffffffff ffffffff81c06930 ffffffff816bdae0
Apr 10 15:47:23 ex-etp3 kernel:Call Trace:
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816bcf06>] ? __schedule+0x1b6/0x570
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816bdae0>] ? bit_wait+0x50/0x50
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816bd2f4>] ? schedule+0x34/0x80
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816bfd25>] ? schedule_timeout+0x165/0x290
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff81264366>] ? generic_make_request+0x116/0x290
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816bdae0>] ? bit_wait+0x50/0x50
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816bcce9>] ? io_schedule_timeout+0x99/0x100
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816bdaf2>] ? bit_wait_io+0x12/0x60
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816bd725>] ? __wait_on_bit+0x55/0x80
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816bdae0>] ? bit_wait+0x50/0x50
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816bd919>] ? out_of_line_wait_on_bit+0x79/0xa0
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff8109d970>] ? autoremove_wake_function+0x30/0x30
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff8122301f>] ? jbd2_journal_commit_transaction+0x198f/0x1b00
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff81085b41>] ? finish_task_switch+0x81/0x1f0
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff81226150>] ? kjournald2+0xc0/0x260
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff8109d940>] ? wake_up_bit+0x20/0x20
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff81226090>] ? commit_timeout+0x10/0x10
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff81080922>] ? kthread+0xd2/0xf0
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816c5645>] ? __clear_rsb+0x25/0x3d
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff81080850>] ? kthread_park+0x50/0x50
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff81068beb>] ? SyS_exit_group+0xb/0x10
Apr 10 15:47:23 ex-etp3 kernel: [<ffffffff816c0ff4>] ? ret_from_fork+0x44/0x70
Comment 1 pkb07815 2019-04-10 16:25:05 UTC
Created attachment 282265 [details]
kern log
Comment 2 pkb07815 2019-04-10 16:26:38 UTC
Created attachment 282267 [details]
kern log
Comment 3 pkb07815 2019-04-10 16:27:19 UTC
Created attachment 282269 [details]
kern log
Comment 4 Theodore Tso 2019-04-10 16:50:40 UTC
What is your hardware configuration --- in particular, what sort of HDD or SSD are you using?   And what are you running on the system?

This looks very much like I/O is just getting very badly bottlenecked.   When did this start?   If it was OK on an older kernel version, which one?   Are you using some Linux distribution?
Comment 5 pkb07815 2019-04-10 17:57:45 UTC
Here is the information:

# show raid
Raid configuration:
  Model:            LSI MegaRAID SAS 9260-4i
  Serial:           SK72468112
  Firmware Version: 2.130.383-2315

Apr  9 00:44:22 ex-etp3 kernel:scsi host0: Avago SAS based MegaRAID driver
Apr  9 00:44:22 ex-etp3 kernel:scsi 0:2:0:0: Direct-Access     LSI      MR9260-4i        2.13 PQ: 0 ANSI: 5

Apr  9 00:44:22 ex-etp3 kernel:sd 0:2:0:0: [sda] 1171062784 512-byte logical blocks: (600 GB/558 GiB)
Apr  9 00:44:22 ex-etp3 kernel:sd 0:2:0:0: Attached scsi generic sg0 type 0
Apr  9 00:44:22 ex-etp3 kernel:sd 0:2:0:0: [sda] Write Protect is off
Apr  9 00:44:22 ex-etp3 kernel:sd 0:2:0:0: [sda] Mode Sense: 1f 00 10 08
Apr  9 00:44:22 ex-etp3 kernel:sd 0:2:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA

[admin@ex-etp3 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda5       2.0G  1.8G  106M  95% /
tmpfs            63G  3.0G   60G   5% /dev/shm
/dev/sda2       120M   30M   84M  27% /boot
/dev/sda1       119M  1.8M  111M   2% /bootmgr
/dev/sda8       182M   50M  123M  29% /config
/dev/sda9        30G   22G  6.3G  78% /var
/dev/sda10      453G  329G  102G  77% /data
tmpfs            63G   24K   63G   1% /run
tmpfs            64M  980K   64M   2% /vtmp
[admin@ex-etp3 ~]# mount
/dev/sda5 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda2 on /boot type ext3 (ro,noatime,noexec)
/dev/sda1 on /bootmgr type ext3 (ro,noatime,noexec)
/dev/sda8 on /config type ext3 (rw,noatime,noexec)
/dev/sda9 on /var type ext3 (rw,noatime)
/dev/sda10 on /data type ext3 (rw,noatime,barrier=0)
tmpfs on /dev/shm type tmpfs (rw)
tmpfs on /run type tmpfs (rw,nosuid,nodev,strictatime,mode=0755)
tmpfs on /vtmp type tmpfs (rw,size=64M)




 description: Motherboard
       product: H8DGU
       vendor: Supermicro
       physical id: 0
       version: 1234567890
       serial: NM179S600489
       slot: To Be Filled By O.E.M.
     *-firmware
          description: BIOS
          vendor: American Megatrends Inc.
          physical id: 0
          version: FI3.0c
          date: 12/31/2013

cpu:0
          description: CPU
          product: (To Be Filled By O.E.M.)
          vendor: Advanced Micro Devices [AMD]
          physical id: 4
          bus info: cpu@0
          version: AMD Opteron(tm) Processor 6380

 *-storage
                description: RAID bus controller
                product: MegaRAID SAS 2108 [Liberator]
                vendor: LSI Logic / Symbios Logic
                physical id: 0
                bus info: pci@0000:02:00.0
                logical name: scsi0
                version: 05
                width: 64 bits
                clock: 33MHz
                capabilities: storage pm pciexpress vpd msi msix bus_master cap_list rom
                configuration: driver=megaraid_sas latency=0
                resources: irq:19 ioport:c000(size=256) memory:fd59c000-fd59ffff memory:fd5c0000-fd5fffff memory:fd5a0000-fd5bffff
              *-disk
                   description: SCSI Disk
                   product: MR9260-4i
                   vendor: LSI
                   physical id: 2.0.0
                   bus info: scsi@0:2.0.0
                   logical name: /dev/sda
                   version: 2.13
                   serial: 003307152fc3005222f0e91a0db00506
                   size: 558GiB (599GB)


kernel:Linux version 4.9.86 (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP PREEMPT


The system has a lot of I/O traffic, and this happens intermittently. We are seeing the issue mainly after upgrading from 4.1 to 4.9.75 and 4.9.86.
Comment 6 pkb07815 2019-04-10 18:01:57 UTC
I have tried the noop, CFQ and deadline I/O schedulers, with the same results.

I/O is mainly related to the /data partition, which is mounted as:

/dev/sda9 on /var type ext3 (rw,noatime)
/dev/sda10 on /data type ext3 (rw,noatime,barrier=0)

/var mainly holds logging. We have a lot of database traffic (postgres) flowing under /data.
I have also tried writeback instead of ordered (default) for /data.
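
For reference, a minimal sketch of how a scheduler can be switched per device (standard sysfs interface on the legacy block layer used by 4.9; sda is the device from this report):

# show the available schedulers; the active one is in brackets
cat /sys/block/sda/queue/scheduler
# switch, e.g., to deadline (takes effect immediately, not persistent across reboots)
echo deadline > /sys/block/sda/queue/scheduler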
Comment 7 pkb07815 2019-04-10 18:06:39 UTC
Created attachment 282273 [details]
iostat
Comment 8 pkb07815 2019-04-10 18:21:54 UTC
The issue is reproducible:

I have used the dd command to generate files in the /data area to compete with the postgres operations, and I am using iostat -x to monitor the I/O conditions.

Here is how I run dd, varying the count value and sleep time:
for i in {1..100}; do nice -n 10 dd if=/dev/zero of=/data/tmp/testonly/tmpfile$i.txt count=1024 bs=1048576; echo "done tmpfile $i"; sleep 10; done &

I use the postgresql utility pgbench, paired with other in-house SQL commands, to test transactions:
su -s /bin/bash postgres -c "/usr/postgresql/bin/pgbench -T 90000 -c 8 bench2" &

When I execute both pgbench and dd, I see processes going into D state and staying stuck there for a long time (at least 140 seconds).
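
As a side note, a minimal sketch of how the stuck processes can be observed while the test runs (standard ps/procps options; not the exact commands used here):

# list uninterruptible (D-state) tasks and the kernel symbol they are sleeping in
ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /^D/'
# or sample it once per second
watch -n 1 "ps -eo pid,stat,wchan:32,comm | awk '\$2 ~ /^D/'"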
Comment 9 pkb07815 2019-04-12 19:33:56 UTC
Here is the complete information about the disk:

# /opt/MegaRAID/MegaCli/MegaCli64 -pdlist -aALL
                                     
Adapter #0

Enclosure Device ID: 252
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 6
WWN: 5000C500A167B624
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.406 GB [0x45cd0000 Sectors]
Sector Size:  0
Logical Sector Size:  0
Physical Sector Size:  0
Firmware state: Online, Spun Up
Commissioned Spare : No
Emergency Spare : No
Device Firmware Level: N003
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c500a167b625
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST600MM0088     N003W420R7PJ            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: Unknown 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :28C (82.40 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No
Comment 10 pkb07815 2019-04-15 17:15:26 UTC
(In reply to Theodore Tso from comment #4)
> What is your hardware configuration --- in particular, what sort of HDD or
> SSD are you using?   And what are you running on the system?
> 
> This looks very much like I/O is just getting very badly bottlenecked.  
> When did this start?   If it was OK on an older kernel version, which one?  
> Are you using some Linux distribution?

Any update or suggestions please?
Comment 11 Theodore Tso 2019-04-15 21:09:49 UTC
Looking at your iostat -x output, /dev/sda is completely utilized:



Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00  1154.00   47.00  836.00   864.00 10076.00    24.78   136.46  116.05  331.72  103.93   1.13 100.00

Note the 100% utilization and the average queue size of 136.46 I/O requests. So as I said, this sounds like I/O overload. Your workload is simply issuing more I/O than the disk can handle. Why that is happening is going to be a userspace issue.

If you don't care about stale data (e.g., belonging to deleted files, which may have privacy issues if you care about GDPR, HIPAA, PCI, etc. mandates) you could try mounting with the mount option "data=writeback". It could be that updating to a newer version of the kernel (4.9.86 is quite antique, and you're probably missing all sorts of security and other bug fixes) and a newer userspace, and then setting up a new set of file systems formatted with ext4, will also help performance. Or there might be something going on with userspace writes to /dev/sda that shouldn't be happening or which could be optimized away.
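
As a sketch of the data=writeback suggestion, assuming it is applied via /etc/fstab (device and mount point taken from this report; the data journaling mode generally cannot be changed on a live remount, so the file system needs a fresh mount or a reboot):

# /etc/fstab entry for /data with writeback journaling (stale-data caveat above applies)
/dev/sda10  /data  ext3  rw,noatime,barrier=0,data=writeback  0  2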

But I don't see anything here which is a kernel bug.
