Bug 13033 - bonding leads to iscsi initiator error
Summary: bonding leads to iscsi initiator error
Status: CLOSED OBSOLETE
Alias: None
Product: Networking
Classification: Unclassified
Component: IPV4
Hardware: i386 Linux
Importance: P1 high
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-04-07 02:48 UTC by zhanghj
Modified: 2012-05-30 15:17 UTC
CC List: 4 users

See Also:
Kernel Version: 2.6.24.7
Subsystem:
Regression: No
Bisected commit-id:


Description zhanghj 2009-04-07 02:48:46 UTC
I have a machine with 4 NICs (Intel e1000 series). After bonding them in mode 6 (balance-alb),
I start the iscsi-target service. The target software is iSCSI Enterprise Target (IET, version 0.4.16),
and the IET config file looks like this:

MaxRecvDataSegmentLength 262144
MaxXmitDataSegmentLength 262144
MaxBurstLength          16776192
FirstBurstLength        262144
Target target.test.229
HeaderDigest            None
DataDigest              None
Lun 0 Path=/dev/vda/p1,Type=blockio,IOMode=wb
Target target.test.230
HeaderDigest            None
DataDigest              None
Lun 0 Path=/dev/vda/p2,Type=blockio,IOMode=wb
Target target.test.232
HeaderDigest            None
DataDigest              None
Lun 0 Path=/dev/vdb/p1,Type=blockio,IOMode=wb
Target target.test.233
HeaderDigest            None
DataDigest              None
Lun 0 Path=/dev/vdb/p2,Type=blockio,IOMode=wb

The config file defines four targets.
/dev/vda/p1, /dev/vda/p2, /dev/vdb/p1 and /dev/vdb/p2 are logical volumes created with lvm2.

Once iscsi-target is running normally, four Windows clients connect to it, and each client logs on to a
different target.
After logon, every client detects an iSCSI virtual disk in the Windows Disk Management service. I then create a
partition on each iSCSI virtual disk and format it with NTFS.
After this I run iometer on the new partition of every client to test iSCSI performance.

After a few minutes, iometer begins to report errors on two clients, and the iSCSI virtual disk even disappears.
Running dmesg on the iSCSI target gives messages like this:

iscsi_trgt: Logical Unit Reset (05) issued on tid:2 lun:0 by sid:844425852944448 (Function Complete)
iscsi_trgt: check_cmd_sn(537) sequence error (171f25,171f46)
iscsi_trgt: cmnd_rx_start(1690) 2 171f47 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f47 2 0 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f25,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f48 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f48 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f26,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f49 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f49 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f27,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f4a -4
iscsi_trgt: cmnd_skip_pdu(454) 171f4a 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f28,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f4b -4
iscsi_trgt: cmnd_skip_pdu(454) 171f4b 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f29,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f4c -4
iscsi_trgt: cmnd_skip_pdu(454) 171f4c 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f2a,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f4d -4
iscsi_trgt: cmnd_skip_pdu(454) 171f4d 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f2b,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f4e -4
iscsi_trgt: cmnd_skip_pdu(454) 171f4e 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f2c,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f4f -4
iscsi_trgt: cmnd_skip_pdu(454) 171f4f 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f2d,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f50 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f50 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f2e,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f51 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f51 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f2f,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f52 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f52 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f30,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f53 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f53 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f31,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f54 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f54 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f32,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f55 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f55 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f33,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f56 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f56 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f34,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f57 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f57 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f35,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f58 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f58 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f36,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f59 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f59 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f37,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f5a -4
iscsi_trgt: cmnd_skip_pdu(454) 171f5a 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f38,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f5b -4
iscsi_trgt: cmnd_skip_pdu(454) 171f5b 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f39,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f5c -4
iscsi_trgt: cmnd_skip_pdu(454) 171f5c 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f3a,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f5d -4
iscsi_trgt: cmnd_skip_pdu(454) 171f5d 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f3b,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f5e -4
iscsi_trgt: cmnd_skip_pdu(454) 171f5e 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f3c,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f5f -4
iscsi_trgt: cmnd_skip_pdu(454) 171f5f 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f3d,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f60 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f60 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f3e,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f61 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f61 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f3f,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f62 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f62 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f40,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f63 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f63 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f41,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f64 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f64 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f42,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f65 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f65 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f43,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f66 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f66 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f44,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f67 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f67 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f45,171f46)
iscsi_trgt: cmnd_rx_start(1690) 2 171f68 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f68 2 0 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f45,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f69 -4
iscsi_trgt: cmnd_skip_pdu(454) 171f69 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f25,171f46)
iscsi_trgt: cmnd_rx_start(1690) 2 171f6a -4
iscsi_trgt: cmnd_skip_pdu(454) 171f6a 2 0 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f25,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f6b -4
iscsi_trgt: cmnd_skip_pdu(454) 171f6b 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f26,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f6c -4
iscsi_trgt: cmnd_skip_pdu(454) 171f6c 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f27,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f6d -4
iscsi_trgt: cmnd_skip_pdu(454) 171f6d 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f28,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f6e -4
iscsi_trgt: cmnd_skip_pdu(454) 171f6e 1 28 0
iscsi_trgt: check_cmd_sn(537) sequence error (171f29,171f46)
iscsi_trgt: cmnd_rx_start(1690) 1 171f6f -4
iscsi_trgt: cmnd_skip_pdu(454) 171f6f 1 28 0
Comment 1 zhanghj 2009-04-07 03:00:03 UTC
I changed the bonding mode to mode=0 (balance-rr) and mode=5 (balance-tlb) and
saw the same problem, but with mode=2 (balance-xor) iometer runs
normally.
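
For reference, the mode numbers used in this report can be tabulated against the names the Linux bonding driver gives them. The sketch below is only an illustration: the `REPORTED` table restates what this bug says (modes 0, 5 and 6 fail, mode 2 works) and is not a general statement about those modes, and the helper names are hypothetical. It assumes the usual sysfs format of /sys/class/net/<bond>/bonding/mode, a name followed by a number such as "balance-alb 6".

```python
# Mode numbers and names as used by the Linux bonding driver.
BONDING_MODES = {
    0: "balance-rr",
    1: "active-backup",
    2: "balance-xor",
    3: "broadcast",
    4: "802.3ad",
    5: "balance-tlb",
    6: "balance-alb",
}

# Outcome per mode as reported in this bug only (hypothetical table name).
REPORTED = {0: "fails", 2: "works", 5: "fails", 6: "fails"}


def parse_sysfs_mode(text):
    """Parse text in the form '<name> <number>', e.g. 'balance-alb 6'."""
    name, number = text.split()
    number = int(number)
    if BONDING_MODES.get(number) != name:
        raise ValueError("unexpected bonding mode: %r" % text)
    return number, name


def reported_outcome(text):
    """Return what this bug report observed for the given mode string."""
    number, _ = parse_sysfs_mode(text)
    return REPORTED.get(number, "not tested in this report")


if __name__ == "__main__":
    # The trailing newline mimics reading the sysfs file directly.
    print(reported_outcome("balance-alb 6\n"))
```

On a live system the input string would come from reading the bond's sysfs mode file; here it is passed in directly so the sketch runs anywhere.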
Comment 2 Andrew Morton 2009-04-10 21:08:31 UTC
I discussed this with Mike Christie <michaelc@cs.wisc.edu> and
Jay Vosburgh <fubar@us.ibm.com>.

Mike said:

I think they are using a Microsoft initiator/host with a non-upstream 
iscsi target/server, so I am not 100% sure. I do not work on either 
project so I am just guessing below :)

From the target logs in the bugzilla, it looks like there was probably a 
disruption with the connection or a command took too long. Then the 
Windows initiator probably tried to run some error handling (lun reset). 
From there it looks like the target and initiator could not agree on 
what was the proper next step (initiator might have sent data when it 
should not have). The initiator eventually gave up and removed the disks 
thinking they were bad or because it could not figure out anything else 
to try.

The bug report should go to
iscsitarget-devel@lists.sourceforge.net
They maintain the IET iscsi target, and can better debug problems 
against the Microsoft initiator since they have people that know how 
that works.
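
One way to read the "sequence error" lines in the target log is as a failed command sequence number (CmdSN) check. The sketch below is an interpretation under stated assumptions, not the IET code: it assumes the pair printed by check_cmd_sn is (received CmdSN, expected CmdSN), and that a command is accepted only when the received value is not behind the expected one under 32-bit wraparound arithmetic. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def signed_diff32(a, b):
    """Return (a - b) interpreted as a signed 32-bit integer."""
    diff = (a - b) & MASK32
    if diff >= 0x80000000:
        diff -= 1 << 32
    return diff


def cmd_sn_acceptable(cmd_sn, exp_cmd_sn):
    """Assumed rule: accept when the received CmdSN is not behind the expected one."""
    return signed_diff32(cmd_sn, exp_cmd_sn) >= 0


if __name__ == "__main__":
    # Values taken from the first "sequence error (171f25,171f46)" line.
    received, expected = 0x171F25, 0x171F46
    print(cmd_sn_acceptable(received, expected))
    print(signed_diff32(received, expected))
```

Under that reading, the first logged pair has the initiator 0x21 (33) commands behind what the target expects, which would fit an initiator resending old commands after the LUN reset, as guessed in the comment above.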
